1. INTRODUCTION



The objective of the research documented in this dissertation is to investigate different 
static and dynamic logic families that can be implemented in a GaAs Complementary
Heterostructure Insulated Gate FET (CHIGFET) process. Key points in the analysis of different
designs are to maximize speed and minimize both consumed power and layout area. In 
addition, the logic circuits should be suitable for VLSI implementations. The dynamic logic 
families discussed in this dissertation include Domino logic, N-P Domino logic and Two- 
Phase Dynamic Logic (TPDL). 

In this chapter, the history of the development of GaAs is introduced in Section A. A 
comparison between the electrical properties of GaAs and silicon is discussed in Section B. 
Section C discusses different GaAs devices. Section D discusses static logic families, while 
Section E explains well-known dynamic logic families. Section F outlines the rest of the 
dissertation. 

A. HISTORICAL REVIEW OF GALLIUM ARSENIDE (GaAs) 

Gallium Arsenide (GaAs) is a compound semiconductor that has been used since the
1960s for microwave amplifiers and optoelectronic devices. In the optoelectronic area,
GaAs is used for light emitting diodes, solid state lasers, and optical sensors. GaAs
transistors, both FET (Field Effect Transistor) and HBT (Heterojunction Bipolar
Transistor), are used for digital integrated circuits, primarily when the application requires
very high speed and the delay and power requirements of silicon CMOS or bipolar ICs are
too high. The high-frequency performance of GaAs digital ICs is excellent because of the
high electron velocity at moderate electric fields. The fast switching capability of these
devices is based on their low internal capacitances and on their high electron velocity.

The use of GaAs for digital applications began in 1974 with some relatively high- 
power, high-speed SSI divider circuits and has developed over the years into a well- 
established LSI technology, with some inroads into the VLSI arena. Initially, GaAs
integrated circuits appeared in digital fiber-optic communication systems [1, 2, 3, 4, 5].
However, the demand for faster computers requires faster logic circuits. The feasibility of 
using GaAs technology to build the next generation of computers has been demonstrated. 
Experimental GaAs RAMs, parallel multipliers, microprocessors and other computer 
circuits with a cycle time of a few nanoseconds or less have been reported [6, 7, 8, 9]. 

B. COMPARISON BETWEEN GALLIUM ARSENIDE (GaAs) AND SILICON
(Si) ELECTRICAL PROPERTIES

1. Electron Mobility 

The resistivity of a doped semiconductor is dependent upon the doping density (the 
number of charge carriers present within the material) and also upon the ease with which 
these carriers can move under the application of an electric field. This latter property is
known as the carrier mobility, which is defined as the ratio of the carrier velocity to the 
electric field strength. The primary advantage of GaAs over Si is that the electron mobility 
of GaAs is approximately five times that of Si, which offers a higher speed of operation 
than that achievable using Si devices [3, 10, 11, 12, 13, 14]. The disadvantages of GaAs are
the relatively high power dissipation per logic gate during low-speed operation, the small
logic swing, and the correspondingly narrow noise margins. Low logic density, a result of
layout restrictions and low manufacturing yield, is also a disadvantage of GaAs logic.
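
The definition of carrier mobility given above can be written compactly. The relations below are the standard low-field textbook forms and are not taken from the cited references; the symbols v_d (drift velocity), E (electric field), q (electronic charge), n and p (electron and hole densities), and mu_n and mu_p (electron and hole mobilities) are introduced here only for illustration.

```latex
v_d = \mu E, \qquad \rho = \frac{1}{q\,(n\,\mu_n + p\,\mu_p)}
```

Under these relations, and for the same doping density and applied field, a material with five times the electron mobility has roughly five times the low-field electron drift velocity and one fifth of the N-type resistivity.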

Digital IC technologies are usually compared using a “power-delay product” 
parameter which is calculated by multiplying the logic gate delay by the power dissipated 
per gate. A high power-delay product implies that either the gate power dissipation is high
(limiting the integration density and increasing power consumption), or that the gate delay
is high (limiting the speed performance of the technology). The power-delay product of GaAs
is roughly four to five times lower than that of Si, an advantage that is principally
determined by the high electron mobility. 
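
The power-delay product described above can be illustrated with a short calculation. This is a minimal sketch: the function name and the two gate operating points are hypothetical and chosen only to show the arithmetic; they are not measurements from the cited references.

```python
def power_delay_product(gate_delay_s: float, gate_power_w: float) -> float:
    """Power-delay product in joules: gate delay (s) times power per gate (W)."""
    return gate_delay_s * gate_power_w


# Hypothetical operating points, for illustration only.
slow_gate_pdp = power_delay_product(gate_delay_s=200e-12, gate_power_w=1.0e-3)
fast_gate_pdp = power_delay_product(gate_delay_s=100e-12, gate_power_w=0.5e-3)

# A ratio above one means the second gate has the better figure of merit.
pdp_ratio = slow_gate_pdp / fast_gate_pdp
print(f"{slow_gate_pdp * 1e15:.0f} fJ vs {fast_gate_pdp * 1e15:.0f} fJ, "
      f"ratio {pdp_ratio:.1f}")
```

Because the product has units of energy, it can be read as the energy spent per switching event, which is why a lower value is better.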

2. Semi-insulating Substrate

When semiconductor devices are fabricated together on an IC, it is important that any 
interaction between them which might negatively affect the performance of the complete 
circuit be kept to a minimum. Thus, the integrated components must be electrically isolated 
from each other. The resistivity of a semiconductor is dependent on the doping level. 
Silicon technologies generally use a doped N-type or P-type substrate with devices 
fabricated in oppositely doped “wells” to provide sufficient device isolation at relatively 
small separations. In comparison, the intrinsic resistivity of GaAs is several orders of 
magnitude higher, falling into the “semi-insulating” range [13, 15, 16]. This property
allows GaAs devices to be fabricated in wells in an undoped substrate while still maintaining
good isolation. The principal advantage of this is the reduced parasitic capacitance in GaAs 
ICs compared to that of silicon ICs [12, 13]. 

3. Radiation Hardness 

The immunity of an IC to damage from exposure to radiation is important for many 
applications. Military, space, and nuclear systems have certain requirements for radiation 
hardness. Other applications may also be affected if the IC is sensitive to naturally
occurring radiation from sources in the environment or packaging. GaAs has an advantage
over silicon, depending on the type of radiation concerned. Radiation can ionize the atoms 
in a material. The total ionized charge produced depends on the total radiation dose. GaAs 
FETs have higher immunity than silicon ICs to total dose effects from ionizing radiation 
and only undergo a small change in their operating parameters [17]. GaAs has a high 
density of energy states at the surface which absorbs charge produced by radiation and thus 
minimizes changes in device parameters such as threshold voltage and parasitic resistance. 
It also prevents large surface leakage currents from occurring. The sensitivity of device 
characteristics to a radiation pulse is defined as the transient radiation hardness. The 
resulting transient radiation current may occur in two regions, within the active device or 
in the substrate [18]. In GaAs ICs, the dominant effects are due to substrate current. The
wider band gap of GaAs compared to that of Si (excluding silicon-on-sapphire, SOS) makes it
harder to generate carriers and thus the sensitivity of GaAs to transient radiation is
relatively low, provided that high quality, undoped substrates are used.

4. Hole Mobility 

The hole mobility of GaAs is lower than that of silicon by a factor of approximately 
five. Therefore, devices based upon the use of P-type GaAs will be slower than those based 
upon P-type silicon. Also, P-channel GaAs FETs have a higher channel resistance which 
degrades speed compared to silicon [13]. 

5. Other Properties 

One of the major disadvantages of Complementary GaAs, compared to silicon, is the 
cost factor. This high cost occurs for several reasons. The biggest cause of the relatively 
high cost of Complementary GaAs ICs is the fact that yields are low and wafer sizes are 
small compared to silicon [13]. This means that there are fewer working devices per wafer 
over which the processing costs can be shared. The material costs are also higher (GaAs crystals
are difficult to grow and IC production sometimes uses gold-based metallization). In 
practice, this means the use of Complementary GaAs is restricted to high speed 
applications where high cost can be tolerated in order to obtain a degree of performance 
that is not available from silicon ICs. 

Thermal conductivity is also an important property. High thermal conductivity is 
required to dissipate the heat generated by an IC during operation. GaAs has lower thermal 
conductivity than silicon. Thus, a GaAs IC runs hotter than a silicon IC when dissipating 
the same power. Also, thermal gradients across the surface of a GaAs chip are more severe. 
Thinning the substrate of GaAs ICs will aid in reducing this problem. 

C. GaAs DEVICES 

The Gallium Arsenide transistor with a diffusion gate structure, which was first
reported in 1967, yields useful gain in the low megahertz frequency band. In 1969, the
silicon field effect transistor (FET) was developed with a 1 µm gate length. In 1971, a
significant step was made when 1 µm gate length FETs on GaAs were developed with a
useful gain up to 18 GHz.

Oxide growth has been tried on GaAs surfaces for more than 20 years. The quality of 
oxide grown on GaAs has been poor, and a high density of surface states results at the 
GaAs-insulator interface. These effects make it difficult to fabricate GaAs MOSFETs. 
Schottky barrier MESFETs and Junction Field Effect transistors (JFET) are examples of 
practical GaAs FETs. In many cases, these devices are fabricated by direct implantation 
into a GaAs semi-insulating substrate. Both enhancement and depletion MESFETs can be 
fabricated and each has advantages, depending on the application. The JFET is basically a 
voltage controlled resistor that employs a p-n junction as a gate to control the resistance, 
and thus the current that flows, between two ohmic contacts. The JFET has a lower 
switching speed than the MESFET because of the higher input edge capacitances in a
planar JFET process. The advantage of JFETs is that complementary logic is possible
because n-p and p-n junctions can be fabricated on the same wafer. 

The development of advanced epitaxial growth techniques such as Molecular Beam 
Epitaxy (MBE) in the 1970s has enabled the fabrication of useful high quality 
semiconductor heterostructures. The original concept of heterojunction FETs (HFETs) came
from experimental observation of enhanced electron mobility in modulation-doped
heterostructures. The term modulation doping has led to the name MODFET for the first
generation of HFETs. 

1. Noise in Digital Circuits

Noise in a digital circuit can be classified as either internal or external. External noise 
is generated outside the integrated circuit of interest, such as power supply ripple or 
electrostatic discharge. Internal noise is generated inside the integrated circuit of interest, 
such as mutual inductance and/or capacitance (cross-talk) between signal lines, inductive 
and resistive voltage spikes, power supply and ground lead voltage drop, interconnect
reflections, etc. A practical IC must be able to tolerate both types of noise in order to
operate successfully and reliably in the intended application.

The dc noise margins are the parameters that measure the ability of a circuit to operate 
error-free in a noisy environment. The noise margins of digital logic circuits can be 
measured in many different ways. Slope = -1 noise margin, intrinsic noise margin and 
maximum width noise margin are examples of the methods to measure noise margin of 
digital circuits. The factors that influence the noise margins are the voltage gain and the 
symmetry of the transfer characteristics of the circuit. As the voltage gain increases, the 
slope of the output voltage in the transition region increases and consequently the noise 
margins are increased. When the transfer curve of a digital circuit is symmetric, the noise 
margin low is equal to the noise margin high. 
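
The slope = -1 method mentioned above can be sketched numerically. The example assumes the common textbook definitions NM_L = V_IL - V_OL and NM_H = V_OH - V_IH, where V_IL and V_IH are the input voltages at which the transfer curve has a slope of -1. The tanh-shaped transfer curve `vtc`, its voltage levels, and all function names are hypothetical stand-ins, not a MESFET model from this dissertation.

```python
import math


def vtc(vin, v_ol=0.1, v_oh=0.7, v_m=0.4, gain=10.0):
    """Hypothetical smooth inverter transfer curve with peak gain `gain` at v_m."""
    swing = v_oh - v_ol
    return v_ol + 0.5 * swing * (1.0 - math.tanh(2.0 * gain * (vin - v_m) / swing))


def slope_minus_one_margins(curve, v_start=0.0, v_stop=0.8, n=8001):
    """Return (NM_L, NM_H) using the slope = -1 (unity-gain) criterion."""
    step = (v_stop - v_start) / (n - 1)
    vin = [v_start + i * step for i in range(n)]
    vout = [curve(v) for v in vin]
    # Interior sample points where the central-difference slope is -1 or steeper.
    steep = [i for i in range(1, n - 1)
             if (vout[i + 1] - vout[i - 1]) / (2.0 * step) <= -1.0]
    if not steep:
        raise ValueError("transfer curve never reaches a slope of -1")
    v_il, v_ih = vin[steep[0]], vin[steep[-1]]  # unity-gain input levels
    v_oh, v_ol = vout[0], vout[-1]              # settled output levels
    return v_il - v_ol, v_oh - v_ih


nm_low_10, nm_high_10 = slope_minus_one_margins(lambda v: vtc(v, gain=10.0))
nm_low_5, nm_high_5 = slope_minus_one_margins(lambda v: vtc(v, gain=5.0))
print(f"gain 10: NM_L = {nm_low_10:.3f} V, NM_H = {nm_high_10:.3f} V")
print(f"gain 5:  NM_L = {nm_low_5:.3f} V, NM_H = {nm_high_5:.3f} V")
```

With this symmetric curve the two margins come out equal, and the higher-gain curve gives the larger margins, which matches the two qualitative statements in the paragraph above.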

The dynamic noise margin refers to the ability of a circuit to maintain a constant output 
when a short-duration noise pulse is present on the input. It depends on both the width and 
the magnitude of the input pulse. As the width of the input pulse decreases, a greater 
magnitude will be required to upset the output of the circuit and vice versa.

2. Power Dissipation in GaAs Circuits 

When designing a GaAs digital circuit, the first parameters optimized are usually the 
noise margins. Then, the speed and the dissipated power can be optimized. GaAs digital 
circuits dissipate two types of power, static power and dynamic power. Static power is 
dissipated due to current flow from supply to ground or supply to supply during at least one 
logic state. Dynamic power is dissipated during switching due to the charging and 
discharging of the capacitive load. The total power dissipated by the circuit is the sum of 
both types. 
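
The two contributions can be combined in a first-order estimate. This sketch assumes the static current flows for a fixed fraction of the time and that the load is charged from the supply rail once per cycle, so the supply delivers a charge of C times the voltage swing at VDD. The function name, parameter names, and numerical values are hypothetical and serve only to show the bookkeeping.

```python
def total_gate_power(v_dd, i_static, static_fraction, c_load, v_swing, f_switch):
    """Return (static, dynamic, total) power in watts for one gate."""
    # Static term: supply current that flows only while the gate sits in the
    # conducting logic state(s), averaged over time.
    p_static = v_dd * i_static * static_fraction
    # Dynamic term: each cycle the supply delivers charge c_load * v_swing at v_dd.
    p_dynamic = c_load * v_swing * v_dd * f_switch
    return p_static, p_dynamic, p_static + p_dynamic


# Hypothetical operating point, for illustration only.
p_static, p_dynamic, p_total = total_gate_power(
    v_dd=1.5, i_static=0.2e-3, static_fraction=0.5,
    c_load=50e-15, v_swing=0.6, f_switch=1e9,
)
print(f"static {p_static * 1e6:.0f} uW, dynamic {p_dynamic * 1e6:.0f} uW, "
      f"total {p_total * 1e6:.0f} uW")
```

For a full-swing gate (v_swing equal to v_dd) the dynamic term reduces to the familiar C V^2 f expression.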

3. Depletion-Mode Logic Circuits 

N-type depletion-mode FETs (DFETs) are ON when VGS = 0 and require a negative
VGS to cut off the flow of drain current. VDS is usually kept positive all the time. Logic
circuits containing only DFETs are characterized by unequal input and output voltage
levels and the need for level shifting networks. Both negative and positive signals are
required, thus two power supplies are needed, leading to an increase in power dissipation
and system-level design complexity. The use of two power supplies also makes this type of
logic more susceptible to backgating effects. The backgating effect is the depletion of
charge carriers from the back side of the channel when the substrate is negative with
respect to the source, causing an increase in the threshold voltage which decreases the drain
current and subsequently reduces the switching speed. The main advantage of DFET-only
logic is the increased voltage transition which increases both the noise margin and the yield.

4. Enhancement/Depletion Mode Logic Circuits 

This logic family requires both enhancement and depletion-mode FETs. Its main 
advantage over DFET-only logic is the equal input and output voltage levels. Therefore, 
there is no need for level shifting networks, which saves layout area. It also requires only 
one power supply with a value less than that of depletion-mode only logic circuits. The 
small logic transition provided by this type of logic leads to a higher speed and lower power 
dissipation compared to depletion-mode logic families. The main drawback of this family 
is that the small logic transition makes it sensitive to parameter variations, especially the 
threshold voltage. Therefore, it requires a uniformity of threshold voltage between 
components and between wafers. 

D. GaAs MESFET STATIC LOGIC CIRCUITS 

MESFET static logic families are ratioed logic, meaning that their high and low logic
levels are determined by the width and length ratios of the load and switching FETs. GaAs
MESFET static logic families dissipate static power, due to the current flow from the
supply voltage to ground. Therefore, their performance is tied to constant power-delay
curves. The main property that distinguishes the design of GaAs FET circuits from CMOS
circuits is the forward bias gate conduction that results from the Schottky barrier at the gate/
channel junction of the FET. This gate conduction clamps the upper value of VGS (gate to
source voltage) to about 0.7 volts, the voltage required to forward bias the Schottky barrier
diode.

The design of GaAs digital logic circuits is a multidimensional problem. The circuit 
design problem is much more difficult than silicon MOS circuit design because there are 
few standards established for logic levels and supply voltages, and no preferred or 
obviously superior circuit topology has dominated design at the present time. The threshold 
voltages of both depletion and enhancement-mode FETs are different from one foundry to
another, and a relatively wide variance from the mean threshold voltage is allowed. 
Therefore, the GaAs user must utilize a more general set of design methodologies than is 
necessary to complete the design of a standard NMOS or CMOS silicon IC [12]. 

The approach taken to optimize the design will depend on the application in which the
circuit will be used. Speed and consumed power can be traded off over a range of about five
to one for most circuits, without changing the threshold voltage. Speed can also be traded,
to a degree, against circuit tolerance to process variation, supply voltage or ground
fluctuations, and temperature variations, because a circuit with a low logic transition will
exhibit less delay than one designed with a larger logic transition, while the latter is more
robust in a digital system application [12]. The highest priority is assigned to the dc
functionality of the circuit (noise margin and delay) over the expected range of process 
parameters and operating temperatures. Without satisfying this prerequisite, the maximum 
operating speed of the designed circuit has no significance. The following sections describe 
the relevant design details for the most commonly used static logic gate families. 

1. Directly-Coupled FET Logic (DCFL) 

DCFL has the lowest power consumption and highest logic function density. It is also 
the simplest form of the static logic families. It uses an E-MESFET for the switch device 
and D-MESFET for the load device. Figure 1.1 shows a DCFL inverter whose output logic
high voltage will move towards VDD when the switch is off, but be clamped by the forward
biased gate-channel Schottky diode at the input of the next switch device. This limits the
output high voltage to about 0.7 V (VOH = 0.7 V) and reduces the noise margins of this
family. The key point for correct operation is to decrease the output low voltage (VOL)
below the threshold voltage of the E-MESFET switch device. Design of the inverter
involves scaling the device gate widths and lengths so as to get an acceptable logic low and
a reasonable noise margin [2, 19, 20].

Due to its low noise margin, only basic logic gates can be reliably fabricated. NAND
gates are not recommended in this logic family since they use two series E-MESFETs,
which degrades the noise margin.

Figure 1.1: Directly-Coupled FET Logic (DCFL) Inverter



2. Buffered FET Logic (BFL) 

DCFL circuits are difficult to make due to the need to integrate enhancement and
depletion mode MESFETs. The BFL family uses only D-MESFETs. Figure 1.2 shows a
BFL inverter. The use of the D-MESFET as a switch device requires that the input voltage
transition must be negative (below the threshold voltage) to turn the device off. Because
the output of the inverter cannot fall below zero, a level shifting stage (D1, D2, Q4) is used
to shift the output negatively, which requires another power supply (VSS). Q3 is a source
follower used to increase the drive capability and increase the speed of this logic family
[21]. Unbuffered logic is derived from BFL by eliminating the source follower Q3 and
connecting the drain of Q1 directly to D1. This can be done when the drive capability of
BFL is not needed.

In comparison to DCFL, BFL has a much higher power dissipation due to the extra 
level shifting stage and the use of two voltage rails. The buffering circuit provides a lower 
propagation delay and higher drive capability. Also, fabrication of BFL circuits is easier 
than that of DCFL because it uses only D-MESFETs and has much better tolerance to 
variations in MESFET characteristics. 



Figure 1.2: Buffered FET Logic (BFL) Inverter



3. Schottky Diode FET Logic (SDFL) 

When the power dissipation of BFL is too high, the level shifting stage can be moved 
from the output to the input of the gate. The diodes can be used as switching elements rather 
than using them only for level shifting purposes, resulting in a new logic family called 
SDFL. The diodes have low capacitance and low series resistance, and there are no minority
carrier charge storage problems as with p-n junction diodes. Therefore, if used for
switching, a saving in area is also possible. Figure 1.3 shows the circuit of a three-input
SDFL NOR gate. In this family, the output of the level shifting stage drives only one
logic gate, so less drive capability is required, much smaller devices can be used, and
the power dissipation is correspondingly reduced [22, 23, 24, 25].

Figure 1.3: Three-input Schottky Diode FET Logic (SDFL) NOR Gate

4. Source-Coupled FET Logic (SCFL)

Source Coupled FET Logic (SCFL) uses a differential gate topology as shown in
Figure 1.4. It is analogous to bipolar emitter coupled logic (ECL). SCFL is the fastest form
of GaAs MESFET logic [26]. It uses differential amplifier circuits which provide the
benefit of good common-mode rejection. This property is advantageous because any
wafer-to-wafer variation in the threshold voltage becomes a common mode voltage and will not
affect the switching threshold of the circuit. This leads to a design that is more tolerant of
FET threshold variation than DCFL. Another advantage of differential circuits is
their high transconductance, gm, leading to a high cut-off frequency and a better switching
speed than DCFL. SCFL has good noise margins because it is differential. The circuit
operates by steering a fixed current through a pair of switches and then using this current
to develop a voltage drop across one of a pair of load devices [27]. The main drawback of
SCFL is the high power consumption. Also, the differential circuit requires routing of all
variables and their complements, which increases the metallization and wiring area, leading
to increases in the parasitic capacitances in the circuit.

Figure 1.4: Source-Coupled FET Logic (SCFL) Inverter (Differential Inputs)

5. Alternative Logic Families 

There are many families other than DCFL, BFL, SDFL, and SCFL with various levels 
of performance. In the following sections, these families are discussed very briefly. 

a. Capacitor-Coupled Logic (CCL) 

This logic family uses a coupling capacitor between stages to give the necessary
level shifting to drive D-MESFET-only logic. CCL offers lower power consumption than
BFL and SDFL due to the absence of any power consuming level shifting stage [28]. The
capacitance in CCL is implemented with a reverse-biased Schottky diode [29]. Figure 1.5
shows the circuit diagram of two cascaded CCL inverters.

Figure 1.5: Capacitor-Coupled Logic (CCL) Inverters



b. Capacitor Diode FET Logic (CDFL) 

In order to overcome the problem of dynamic-only operation of the CCL family,
capacitor diode FET logic has been devised. A dc level shifting stage is added in parallel
with the capacitor. Since this stage is only required to provide coupling at low frequencies,
the drive requirements are small and thus a very low power dissipation can be maintained
in this extra stage [30]. The circuit diagram for this family is shown in Figure 1.6.

Figure 1.6: Capacitor Diode FET Logic (CDFL) Inverter

c. Super-Buffer FET Logic (SBFL)

The use of quasi-complementary output drivers (super-buffers) gives better
current drive capability, which improves the switching speed of the DCFL family, providing a
new circuit called super buffer FET logic (SBFL) [2, 31]. This family has a push-pull
output stage to provide increased drive capability. The SBFL inverter is shown in Figure
1.7. The disadvantage of SBFL is the power and ground current spikes that occur when the
output waveform switches from high to low. The reason for the spikes is that both the
source follower and the pull down transistors are ON at the same time. Power consumption
for SBFL is also very high.

Figure 1.7: Super-Buffer FET Logic (SBFL) Inverter

E. DYNAMIC LOGIC CIRCUITS

Dynamic logic circuits have been used in silicon MOSFET technologies to decrease
power dissipation and thus increase logic function complexity and circuit density. The
basic dynamic gate consists of an N-channel transistor logic structure whose output node is
precharged to VDD through a clocked P-channel transistor and conditionally discharged to
VSS or ground through a switching N-channel transistor. Dynamic circuits require a clock
for proper operation. Dynamic logic is a non-ratioed logic, meaning that the logic levels are
not determined by the width and length ratios of the load and switching transistors. This
allows the design of dynamic logic circuits, in most cases, to use minimum device
dimensions and results in a small layout area and high fan-in. The use of clocked transistors
prevents a direct current path from the power supply to ground, decreasing the
static power dissipation. Dynamic circuits have two phases of operation: precharge and
evaluation. In the precharge phase, the circuit nodes are charged or discharged to some
reference level according to the design. Inputs of the gate can change only during the
precharge phase. At the completion of the precharge phase, the path to VDD is turned off
and the path to ground is conditionally turned on by the clock signal. During the evaluation
phase, these precharged nodes either float high or are pulled down according to the gate
inputs.

The main drawback of dynamic circuits is the need to route the clock signal to every 
gate in the circuit which complicates the routing problem and increases the parasitic 
capacitance. Also, they have a minimum frequency of operation because of the leakage 
current from the precharged nodes to ground. This drawback can be eliminated in Domino 
circuits by using a weak pull up P-channel transistor or a feedback transistor. The routing 
delay of the clock signal across an IC must be considered to prevent clock skew and 
metastability problems. Finally, charge redistribution problems must be taken care of while 
designing any dynamic logic circuit. A strong limitation of the simple dynamic structure, 
which uses only one clock, is the impossibility of cascading the logic blocks to implement 
complex logic [32]. Figure 1.8 illustrates the situation when cascading two stages of the 
simple dynamic structure. When the gates are precharged, the output nodes are charged to 
VDD. During the evaluation phase, the output of the first gate will conditionally discharge.
Due to the finite pull-down time, the precharged node (N1) can discharge the output node
of the following gate (N2) before the output of the first stage is correctly evaluated, causing
an erroneous state as shown in Figure 1.8. This problem can be eliminated through careful
design of cascaded dynamic logic gates, as explained in the following subsections.

Figure 1.8: Erroneous Evaluation in Cascaded Dynamic CMOS Gates
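
The cascading hazard described above can be reproduced with a toy discrete-time model. This is a sketch under stated assumptions: both stages are taken to be single-input dynamic inverters sharing one clock, each pull-down network reacts to the input value present at the start of a time step, and a discharged node cannot recover until the next precharge. The function name and the choice of inverters are hypothetical and are not taken from Figure 1.8.

```python
def evaluate_cascade(a, steps=3):
    """Evaluate two cascaded single-clock dynamic inverters after precharge.

    Returns (n1, n2, ideal_n2), where n1 and n2 are the final node values and
    ideal_n2 is the logically correct second-stage output, NOT(NOT(a)).
    """
    n1, n2 = 1, 1  # both output nodes are precharged high
    for _ in range(steps):
        # Each stage sees the inputs as they are at the start of the step.
        next_n1 = 0 if a == 1 else n1
        next_n2 = 0 if n1 == 1 else n2
        n1, n2 = next_n1, next_n2
    ideal_n2 = a
    return n1, n2, ideal_n2


for a in (0, 1):
    n1, n2, ideal_n2 = evaluate_cascade(a)
    status = "ok" if n2 == ideal_n2 else "ERROR"
    print(f"a={a}: N1={n1}, N2={n2}, ideal N2={ideal_n2} ({status})")
```

For a = 1 the second node is discharged by the still-high precharged value of N1 before N1 has fallen, so N2 ends low although the correct result is high; this is the erroneous state the text refers to.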




1. CMOS Dynamic Logic

a. Domino Logic

Domino logic modifies the simple dynamic structure by adding a static buffer 
(inverter) to each logic gate output. This allows a single clock to precharge and evaluate a 
cascaded set of dynamic domino logic blocks without entering an erroneous state [33]. The 
basic Domino logic gate is shown in Figure 1.9. The output of the dynamic gate goes only 
to the buffer and the output of the buffer is the logic gate output. During the precharge 
phase, the dynamic gate output is a logic high and the buffer output is a logic low. Also, all 
domino gate outputs are low, thus the transistors they drive are cutoff. During the 
evaluation phase, the domino logic gate output can only make a transition from low to high.
As a result, there can be no switching hazards at any node in the circuit because nodes can
make at most a single transition and then must remain stable until the next precharge cycle.
In a cascaded set of logic blocks, each stage evaluates and then causes the next stage to
evaluate. Dynamic domino logic circuits have low power consumption because there is no
dc path from VDD to ground, except for the static buffer. Also, the full pull-down current
is available to drive the output nodes. At the same time, the load capacitance is much 
smaller than complementary circuits because most of the P-channel transistors have been 
eliminated from the load. Domino circuits use a single clock which provides a simple 
operation and full utilization of the speed of each gate. 

The limitation of this circuit technique is that all of the gates are non-inverting, 
meaning that it does not form a complete logic family. Another limitation is that each gate 
must be buffered by a static inverter, meaning that this technique is not completely dynamic 
and dissipates some static power. Finally, in common with all dynamic circuits, charge 
redistribution can be a problem. 
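
The precharge and evaluate behaviour of cascaded Domino gates can be sketched behaviourally. This is a minimal model under stated assumptions: the `DominoGate` class, the `run_domino` helper, and the two-gate AND/OR cascade are hypothetical names and choices made for illustration, and transistor-level effects such as charge redistribution and leakage are ignored.

```python
class DominoGate:
    """Dynamic node followed by a static inverter (behavioural model only)."""

    def __init__(self, pulldown):
        self.pulldown = pulldown  # returns truthy when the N-network conducts
        self.node = 1

    def precharge(self):
        self.node = 1  # dynamic node high, so the buffered output is low

    @property
    def out(self):
        return 1 - self.node  # static inverter (buffer) output

    def evaluate(self, *inputs):
        # The dynamic node can only fall; it cannot rise again until precharge.
        if self.pulldown(*inputs):
            self.node = 0
        return self.out


def run_domino(a, b, c, steps=3):
    """Evaluate out1 = a AND b, then out2 = out1 OR c, with a single clock."""
    g1 = DominoGate(lambda x, y: x and y)
    g2 = DominoGate(lambda x, y: x or y)
    g1.precharge()
    g2.precharge()
    for _ in range(steps):
        o1 = g1.out  # value stage 2 sees at the start of this step
        g1.evaluate(a, b)
        g2.evaluate(o1, c)
    return g1.out, g2.out


print(run_domino(1, 1, 0), run_domino(1, 0, 0), run_domino(0, 0, 1))
```

Because every buffered output starts low after precharge, a later stage is never discharged by a stale high input, and the result ripples from stage to stage like falling dominoes. The model also shows the stated limitation: each gate realizes only a non-inverting function.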







Figure 1.9: Dynamic Domino Logic Gate



To allow a lower frequency of operation and to avoid the risk of storing data on 
floating nodes, a low current, weak pull-up P-channel transistor can be added, in parallel 
with the main pull-up transistor. With the gate of the weak pull-up transistor grounded, the 
buffer input is pulled up during precharge. This will force the buffer output to be low, 
compensating for the leakage currents. The weak pull-up transistor must be small enough 
(small W/L) not to fight against discharging of the dynamic logic gate output node during 
evaluation. There is no significant impact on pull-down current and the power consumed 
during the evaluation phase is tolerable. In some applications, when the precharge time is 
long enough, the clocked P-channel transistor can be eliminated and substituted with the 
weak transistor [32]. Domino gates may also be made latching by including a feedback P- 
channel transistor from the output of the buffer to its input. This transistor is sometimes 
called a ‘not to forget’ transistor.

b. N-P Domino Logic 

The limitation of Domino logic is the lack of inverted logic functions. The 
combination of the dynamic block with a static inverter will give a non-inverted output 
signal. This decreases the logic flexibility and therefore may require more transistors per 
logic function. Another limitation in Domino circuits is the difficulty of pipelining multiple 
stages because all the logic blocks evaluate and precharge together. 

N-P Domino circuits solve the above two problems. The logic functions in N-P 
Domino circuits are implemented using N-type and P-type dynamic blocks, as shown in 
Figure 1.10. The output of the N-type is fed to the input of the P-type logic block and the 
output of the P-type is fed to the input of the N-type logic block, and so on [34, 35]. The 
static buffer is used only if the output of one type of logic is fed to the input of the same 
type. This dynamic logic family requires both the clock and its complement to drive N-type 
and P-type blocks respectively. The N-type logic block will precharge to VDD and the
P-type block will pre-discharge to ground during their precharge phase. That is why this
implementation is sometimes called ‘zipper logic’ [36]. During the precharge phase, all the
transistors in the n-logic and p-logic blocks will be turned off.

The removal of the static inverter will reduce the static power dissipation and 
also make this family capable of producing inverted outputs. The zipper implementation makes 
this family suitable for pipelining. The limitation of this logic family is the use of the slow 
P-channel transistors in evaluating the logic function. This will limit the maximum 
frequency of operation of the circuit, especially in gallium arsenide implementations. 

c. Two-Phase Dynamic Logic 

This family uses only N-channel transistors in evaluating the function. As the 
first stage evaluates its output node, the second stage will be precharging, and vice versa. 
The two successive blocks are fed from two non-overlapping clock phases. A pass gate is 
used between successive logic blocks to isolate the data stored at the input of the second 
block from corruption when the output of the first block is precharging. Clocking of 
two-phase logic is shown in Figure 1.11. The two clock phases φ1 and φ2 are non-overlapped in 
the logic high level [32]. The main disadvantage of this family is the use of extra 
metallization to run the two clock phases and their complements. 

Figure 1.11: Clocking of Two-Phase Dynamic Logic 
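
The non-overlap requirement on the two clock phases can be illustrated with a short sketch. The function names, the sampled-waveform representation, and the timing values (in arbitrary time units) are assumptions made only for illustration; they are not taken from reference 32 or from the clock generator designed later in this dissertation.

```python
def two_phase_clocks(period, high_time, steps):
    """Sample two idealized clock phases over one period.

    phi1 is high for the first `high_time` of the period; phi2 is high for
    `high_time` starting half a period later. Requiring
    high_time < period / 2 leaves gaps in which both phases are low.
    """
    if not 0 < high_time < period / 2:
        raise ValueError("high_time must lie between 0 and period / 2")
    phi1, phi2 = [], []
    for k in range(steps):
        t = k * period / steps
        phi1.append(1 if t < high_time else 0)
        phi2.append(1 if period / 2 <= t < period / 2 + high_time else 0)
    return phi1, phi2


def overlap_in_high_level(phi1, phi2):
    """Return True if both phases are high at the same sample."""
    return any(a and b for a, b in zip(phi1, phi2))


# Hypothetical timing: period of 100 units, each phase high for 40 units.
phi1, phi2 = two_phase_clocks(period=100.0, high_time=40.0, steps=100)
phases_overlap = overlap_in_high_level(phi1, phi2)
```

With these values each phase is high for 40 of the 100 samples, and the two high intervals are separated by 10-unit gaps, so a stage clocked by one phase never evaluates while the other phase is high.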

2. GaAs MESFET Dynamic Logic Families 

In GaAs MESFET static logic circuits, direct current flows at all times from power 
supply to ground. Also, static logic is usually dependent on width ratios to control logic 
levels and noise margins. Therefore, these circuits are called ‘ratioed logic’. The speed and 
power dissipation are inversely proportional to the device widths. In static logic circuits, 
decreasing the power supply voltage will reduce the dissipated power but will reduce the 
output voltage transition which decreases the noise margin. Also, reducing the supply 
voltage will reduce the current flow in the circuit which decreases the switching speed. 
Shrinking the transistor-gate widths will save some layout area but will sacrifice both the 
drive capability and the speed of the circuit by reducing the current. 

Dynamic logic has been relatively un-exploited with GaAs FET technology. Earlier 
applications were mainly oriented toward very high speed SSI circuits such as divide by 
two and shift register circuits [37, 38, 39]. 

The first Domino circuit was demonstrated in reference 40 and is a four-input AND 
gate. It is shown in Figure 1.12. It consists of three stages, an input stage, an inverting stage, 
and a level shifting stage and is composed entirely of depletion mode MESFETs. It requires 
two power supplies and two in-phase clock signals, level shifted with respect to each other. 
If Q1 and Q2 are enhancement-mode transistors, there will be only one clock and one 
power supply required for this circuit. This will decrease the noise margin of the circuit and 
decrease the consumed power. 

Another Domino circuit is presented in reference 41 and shown in Figure 1.13. It is 
called Capacitively Coupled Domino Logic (CCDL). It differs from that in Figure 1.12 in 
two regards. First, the order of the inverting and level shifting stages is reversed. Second, 
three different threshold MESFETs are required. However, current fabrication technology 
has enough difficulty controlling two different threshold MESFETs. 

The third Domino circuit is called Trickle Transistor Dynamic Logic (TTDL) 
presented in reference 42 and shown in Figure 1.14. This Domino topology uses static level 
shifting rather than capacitive, trading off the need for two clock signals for increased 
power dissipation. The major drawback of this design is that it requires four power 
supplies. In all of the above AND gate circuits, the propagation delay was measured to be 
around 200 ps and the power dissipated per gate was about 0.5 mW. Due to the previously 
mentioned drawbacks of the Domino circuits, this topology has seen limited use. 

The GaAs implementation of Two-phase Dynamic FET Logic (TDFL) is presented in 
references 43, 44, 45, 46 and 47. The schematic of two TDFL inverters in series is shown 
in Figure 1.15. The gates operate from a single power supply and two non-overlapping (in 
the logic high level) clocks. The precharge phase of the first gate is the evaluation phase of 
the second stage and vice versa. TDFL circuits are self-latching and are suited for pipelined 
architectures. TDFL gates are non-ratioed, which compacts the circuit layout. 

Another topology used in GaAs dynamic circuits is pass transistor logic. Application 
of Si MOS pass transistor topologies in GaAs MESFET circuits is not straightforward. 
MESFET gate conduction limits the signal levels on control gates and also limits noise 
margins. Gate conduction requires different logic transitions on gate nodes than on 
drain and source nodes [48]. The circuits used to generate the control signal levels from the 
data signal levels are both area and power consuming, which limits the use of this topology. 

Figure 1.12: GaAs MESFET Domino 3-Input AND Gate 

Figure 1.14: GaAs MESFET TTDL 3-Input AND Gate 

Figure 1.15: Two GaAs MESFET TDFL Inverters in Series 



F. OUTLINE OF DISSERTATION 



In this dissertation, the possibility of using dynamic digital logic circuits with 
Complementary GaAs (CGaAs) fabrication processes is explored. Different dynamic 
circuit configurations are presented. The main problem of exploring these dynamic logic 
families in NGaAs technology is the gate conduction of GaAs transistors, as well as the 
absence of the PFETs. This problem is eliminated (partially) when using isolated gate 
CGaAs technology. In this case, the charge can be stored on the gate capacitor of the 
transistor for a longer time. 

Chapter I has been an overview of static logic families (ratioed logic), as well as 
dynamic logic families implemented in CMOS and NGaAs technologies. The theory of 
operation and the characteristics of the Complementary Heterostructure Isolated Gate Field 
Effect Transistor (CHIGFET) are explained in Chapter II. The power consumption of 
circuits implemented in CGaAs is much lower than that of circuits in NGaAs technology, 
which brings CGaAs technology into the LSI and VLSI regimes. The main problem of 
CGaAs logic is that the PFET is much slower than the NFET, which slows down this new 
technology. 

Various dynamic logic families are designed, analyzed and/or implemented in the 
described research. In these dynamic logic families, the slow PFETs are used only to 
precharge the output nodes. PFETs are not used in the evaluation blocks, thus the speed of 
these dynamic logic families is much higher than the standard static CGaAs logic family. 
There is no direct path from the supply to the ground at any time. Therefore, the static 
power dissipation of these families is very low. 

Chapter III discusses the design of the basic combinational logic gates (Inverter, 
NAND, NOR, XOR, XNOR gates) using both the static logic and the Two-Phase Dynamic 
FET Logic (TPDL). Also in this chapter, the comparison in speed, power consumption, and 
layout area between these two families is detailed [49]. The TPDL circuits require two 
non-overlapped clock phases and their complements for proper operation. The design and 
analysis of a clock generator are also explained in this chapter. This chapter is the first 
section that details the original contributions of the described research. 

Chapter IV explains the design of the basic sequential circuits (D-latch, D flip-flop and 
linear feedback shift register). The design of these circuits is performed using the static and 
TPDL logic families [49]. This chapter also contains the comparison in performance 
between the two different logic families. 

In Chapter V, four different two-level functions are designed using static logic and all 
the well-known dynamic logic families. The dynamic logic families used are Domino logic, 
N-P Domino logic, and TPDL logic. These four circuits are simulated using HSPICE to 
differentiate among the different logic families in speed, power consumption and layout 
area. Based on the simulation results, it has been found that the performance of the TPDL 
family is superior to all the other static and dynamic logic families [50]. 

Chapter VI explores the feasibility of using the TPDL in practical circuit designs 
through implementation of complex functional blocks. In this chapter, a TPDL Four-Bit 
Carry Lookahead Adder (4-Bit CLA) is designed, optimized and analyzed [51]. Designs 
of the same adder using static logic and pipelined static logic are also completed, and 
a performance comparison among the three designed circuits is performed. 

In Chapter VII, the implementation of the circuits designed and analyzed in the 
previous chapters is completed. Seven integrated circuits are implemented. All ICs are 
designed to drive a 50 Ω resistive load and a 15 pF parasitic capacitance. The design of the 
input receiver and output driver circuits, required for operation and testing of all the 
implemented ICs, is also explained. HSPICE simulation results of all the implemented 
chips, including the drivers, are also presented. The design database of these chips is 
compatible with the Motorola Semiconductor CHIGFET fabrication process. HSPICE 
simulation files of all the implemented chips are included in Appendix A, while the circuit 
layouts of these chips are presented in Appendix B. 

Conclusions and suggestions for further work are presented in Chapter VIII at the end 
of the dissertation. It summarizes the main contribution of this dissertation, which is the 
development and implementation of a new dynamic logic family, Two-Phase Dynamic 
FET Logic (TPDL), for complementary gallium arsenide (CGaAs) fabrication 
technologies. The TPDL family enhances the advantages of CGaAs technology by 
increasing the speed and decreasing the power consumption. It also reduces the layout area. 
The results of the described research allow CGaAs technology to be used for implementing 
VLSI ICs for the first time. 

II. COMPLEMENTARY GALLIUM ARSENIDE (CGaAs) TECHNOLOGY 



Complementary metal oxide semiconductor (CMOS) field effect transistors are now 
the dominant technology for VLSI integrated circuits. This dominance is due to the fairly 
high speed and low power consumption of CMOS logic circuits. CGaAs is the compound 
semiconductor analog of silicon CMOS technology. Also, CGaAs can be used as an 
alternative process for BiCMOS because it offers higher speed and lower power than 
BiCMOS [52]. 

In this chapter, Complementary Heterostructure Insulated Gate FET (CHIGFET) 
devices, provided by Motorola Semiconductor, are investigated for implementing dynamic 
logic circuits. Section A provides an overview of CGaAs technology, discussing the 
problems associated with this technology and their solutions. Section B discusses 
applications for this technology. The internal structure and the fabrication process of 
CHIGFETs are explained in Section C. Section D discusses the gate current in CHIGFETs 
and associated limitations. Finally, Section E explains modeling of both N- and P-channel 
devices using HSPICE simulation tools and discusses their performance. 

A. OVERVIEW OF COMPLEMENTARY GaAs 

Several problems have to be solved before CGaAs technology will be more generally 
useful for a wide variety of applications. The first problem is the low hole mobility in GaAs, 
which diminishes the speed advantage of this technology because speed becomes limited 
by the speed of P-channel devices. The second problem is the gate leakage current that 
diminishes the advantage of low power consumption in CHIGFETs because it increases the 
consumed static power. Also, it limits the input gate voltage transition which leads to a 
small noise margin in complementary logic circuits. The third problem is the increased 
source series resistance which increases the gate to source voltage which in turn reduces 
the transconductance. The fourth problem is the subthreshold current, which needs to be 
reduced to lower the power consumption in CHIGFETs and increase the scale of integration. A 
reduction in the subthreshold current is important, especially in sub-micrometer structures, 
because the subthreshold current increases sharply for gate lengths less than 1 µm [53]. 

Solutions to the above problems have been tried through the use of CGaAs technology 
based on N-channel and P-channel AlGaAs/GaAs heterostructure devices. These devices 
offer the high speed, ultra low power consumption and high noise margin suitable for VLSI 
circuits. Also, a quantum-well P-channel AlGaAs/InGaAs/GaAs HIGFET has been 
designed and fabricated to give better performance than AlGaAs/GaAs devices [53, 54]. 
The gate leakage current for both N- and P-channel AlGaAs/InGaAs/GaAs quantum-well 
devices is reduced because they have a larger valence and conduction band discontinuity at 
the AlGaAs/InGaAs interface, compared to AlGaAs/GaAs. Also, the energy band 
discontinuity at the InGaAs/GaAs interface helps to reduce the subthreshold current by 
confining the channel [55]. 

In the absence of dopants, the threshold voltage of the HIGFET is difficult to control. 
In order to change the threshold voltage in a controlled way, dopants must be introduced 
into the HIGFET structure [56]. Dopants can be placed into the wide band semiconductor 
separating the channel from the gate, as done in a MODFET. Also, they can be placed into 
the semiconductor behind the channel as done in an inverted MODFET. Finally, they can 
be placed directly into the channel as done in a doped channel HIGFET. The last case has 
the advantage of having better control of the threshold voltage [57]. Also, the subthreshold 
current and output conductance are reduced in this last case. 

AlGaAs/InGaAs/GaAs Quantum well Doped Channel FETs have been documented in 
[58]. They have a large transconductance, a high mobility, a large scale of integration and 
can be used to implement logic gates with very short propagation delays. Delta-doped 
CHIGFETs are introduced in [59] and make use of a high InAs mole fraction in a 
pseudomorphic InGaAs channel along with a sub-channel delta-doped silicon layer to 
adjust the N-HIGFET and P-HIGFET threshold voltages. This results in a high 
transconductance value, which increases the switching speed with low power consumption. 
Also, a high AlAs mole fraction in the AlGaAs barrier layer is demonstrated in [60] and is 
used to reduce the gate leakage current of both N- and P-HIGFETs. In addition, the use of 
such a high AlAs mole fraction results in reduced subthreshold currents and lowers the 
drain-to-gate leakage current in P-HIGFETs. Some test circuits have been fabricated using 
CHIGFET technology by Honeywell and Motorola and gave promising results. 

B. APPLICATIONS FOR COMPLEMENTARY GaAs 

Complementary GaAs (CGaAs) technology has already achieved 0.01 µW/MHz/gate 
at 0.9 V [52]. By sacrificing some power dissipation, a 1 GHz signal processor has been 
made, as well as full-complementary digital ICs that operate at 500 MHz using the same 
process flow. These circuits have a speed/power measurement of 0.16 µW/MHz/gate [52]. 
The performance of 1 µm CGaAs has been shown to be superior to 0.5 µm CMOS or thin 
film SOI equivalents through the measurement of ring oscillator delay versus supply 
voltage for these technologies [52]. In addition, CGaAs circuits are more tolerant of 
threshold variations across the wafer and from wafer to wafer, resulting in higher yield [61, 
63, 64]. Due to the low threshold voltage of the CGaAs devices and their high current drive, 
performance is very good, even at low power supply voltages [52]. 
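
As a rough illustration of what a figure of merit in µW/MHz/gate implies at the circuit level, the following sketch scales it by clock frequency and gate count. The function name and the gate count of 10,000 are hypothetical, and the calculation assumes every gate dissipates at the quoted rate, which reference [52] may not intend.

```python
def circuit_power_mw(uw_per_mhz_per_gate, clock_mhz, gate_count):
    """Estimate total power in mW from a power/speed figure of merit.

    Assumes (hypothetically) that every gate dissipates at the quoted
    per-gate rate at the given clock frequency.
    """
    total_uw = uw_per_mhz_per_gate * clock_mhz * gate_count
    return total_uw / 1000.0  # convert microwatts to milliwatts


# 0.16 uW/MHz/gate at 500 MHz for a hypothetical 10,000-gate circuit.
estimated_mw = circuit_power_mw(0.16, 500.0, 10_000)
```

Under these assumptions the estimate is about 800 mW, which shows why the lower figure of 0.01 µW/MHz/gate matters for large-scale integration.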

Complementary GaAs is finding applications where the required circuit performance 
is greater than CMOS can provide at a given power dissipation, or where additional speed 
is required when compared to existing CMOS devices. Significant power savings may be 
obtained for CGaAs at high clock rates. The types of applications where CGaAs can be 
used include high performance microprocessors, low-power RISC processors for portable 
applications, digital signal processors, and fast static memory. Complementary GaAs is 
also useful for space system applications due to its inherent total dose and dose rate 
radiation hardness. 

C. GaAs CHIGFET STRUCTURE AND FABRICATION 

The structure of a CHIGFET is analogous to that of a MOSFET with the un-doped 
AlGaAs taking the function of the oxide in CMOS. It is grown by molecular beam epitaxy 
(MBE) on a semi-insulating GaAs substrate as shown in Figure 2.1 (copied from reference 
[52] and not drawn to scale). In this structure, an un-doped GaAs buffer is grown on the 
semi-insulating substrate, followed by an InGaAs channel, then followed by an un-doped 
AlGaAs dielectric and finally ending in a thin un-doped GaAs cap layer. A WSi gate is then 
deposited on the structure, and serves as a self-aligned mask for the source and drain 
implants. After annealing the implant, ohmic contacts are deposited on both source and 
drain. The device may be either N-channel or P-channel depending on source and drain 
implants. The channel may also be delta-doped to control the threshold voltage. The rest of 
the structure is entirely un-doped. The fabrication process of CHIGFET requires only 13 
masks [52]. 

Figure 2.1: GaAs CHIGFET Transistor Structure 



D. GaAs CHIGFET GATE CURRENT 

GaAs CHIGFET structure is analogous to the MOSFET structure with the AlGaAs 
layer playing the role of the oxide. In silicon MOSFET technology, the charge carriers are 
confined to the channel by the silicon dioxide. On the other hand, the CHIGFET relies on 
the band discontinuity between the InGaAs channel and the AlGaAs dielectric to confine 
the charge carriers. Because this band discontinuity is considerably smaller than that at a 
Si-SiO2 interface, the gate current in a CHIGFET is significant, whereas it is negligible in 
most applications of silicon MOSFETs. 



The gate current limits the usable range of gate voltage. If gate current is high, it 
creates an additional voltage across the source series resistance, which causes a sharp drop 
in the device transconductance. Also, increasing the gate voltage will increase the gate 
current, which dramatically increases the power consumed by the circuit without increasing 
the switching speed. 

E. GaAs CHIGFET I-V CHARACTERISTICS 

In this section, the GaAs CHIGFETs that can be fabricated by Motorola are simulated 
using HSPICE simulation tools. The transistor parameters can be found in [52] and have 
been supplied by Motorola. These parameters have been extracted from actual fabricated 
devices. Figures 2.2 through 2.5 show the drain current versus drain to source voltage for 
different width to length ratios of both N- and P-channel transistors. It is clear from these 
figures that the transistors have three regions of operation: the cut-off region where 
VGS < VT, the linear region where VDS < VGS - VT, and the saturation region where 
VDS > VGS - VT (the same regions as for MOSFET transistors). As shown in these figures, 
the drain current increases at different rates as VGS increases, with the rate becoming small 
as VGS exceeds 2.0 volts. Therefore, the best region of operation for this transistor is with 
VGS kept below 2.0 volts. It can be noticed that the drain current for the N-channel transistor 
is about three times that of the P-channel transistor for the same gate to source voltage 
(VGS). Also, it can be noticed that the drain current is doubled when the width to length 
ratio of a transistor is doubled. These are the N- and P-channel transistor models that will 
be used in all the circuit designs in the following chapters. In our circuit designs in the 
following chapters, gate voltage transition will be limited to 1.75 volts to reduce the gate 
leakage current. 

The transistors introduced in this chapter will be used in all the circuit designs 
presented in the following chapters. Chapter III includes the design and implementation of 
the basic combinational logic circuits using both TPDL and static logic. The performance 
comparison between both designs is also presented in Chapter III. 

Figure 2.4: I-V Characteristics of P-Channel GaAs HIGFET Transistor (W=10 µm) 

Figure 2.5: I-V Characteristics of P-Channel GaAs HIGFET Transistor (W=20 µm) 

III. DESIGN AND ANALYSIS OF CGaAs STATIC AND DYNAMIC 
COMBINATIONAL LOGIC GATES 



Logic circuits for digital systems may be combinational or sequential. A combinational 
circuit consists of logic gates with outputs that at any time are determined directly by the 
present combination of inputs, regardless of previous inputs. A combinational circuit 
performs a specific information-processing operation, fully specified logically by a set of 
Boolean functions. Sequential circuits employ memory elements in addition to logic gates. 
Their outputs are functions of inputs and the state of the memory elements. The state of the 
memory elements, in turn, is a function of the previous inputs. As a consequence, the 
outputs of a sequential circuit depend not only on the present inputs but also on the past 
inputs. The circuit behavior must be specified by a time-sequence of inputs and internal 
states. Combinational circuit designs are the subject of this chapter, while sequential circuit 
designs are discussed in the next chapter. 

In the previous two chapters, previous work done by other people was reviewed and 
investigated. This chapter starts the new material of the dissertation. 

In this chapter, the combinational logic gates, which are the basic building blocks of 
any logic circuit, are designed using both CGaAs static and dynamic circuit topologies. The 
principal dynamic logic family of interest in this chapter is Two-Phase Dynamic Logic 
(TPDL). This choice is based on the study of the different dynamic logic families, which is 
explained in Chapter V. The basic logic gates that will be discussed here are the Inverter, 
NAND gate, NOR gate, XOR gate and XNOR gate. In Section A, CGaAs static designs of 
these logic gates are studied. Also in this section, power and speed measurements of the 
CGaAs are carried out through the design of a ring oscillator. Section B explains the 
CGaAs TPDL design of the same gates mentioned above. The loading and the power supply 
effects on the maximum operating frequency of these designs are also explained in each of 
the above sections. The design of a clock generator that generates two non-overlapped 
clock phases and their complements is explained in detail in Section C. These clock phases 
are required for the proper operation of the TPDL circuits. Section D explains the 
comparison of the designed static gates and TPDL gates in speed, power consumed, and 
layout area. 

Circuits designed in this dissertation are simulated using HSPICE simulation tools. 
The size of any circuit simulated by HSPICE is limited only by the virtual memory of the 
computer being used. HSPICE has superior convergence and accurate modeling features. 
It produces a graph data file which is displayed by the Graphic Simulation Interface (GSI) 
tools. The MEASURE statement is used in the HSPICE net-list file to measure the required 
circuit parameters (such as the average power consumption of a circuit). All average power 
consumption figures calculated in this dissertation are measured using the HSPICE 
MEASURE statement. The calculation is made by integrating the current drawn from the 
power supply over the simulation time (which is the total charge Q). Then, the total charge 
is divided by the simulation time (which gives the average current) and multiplied by the 
supply voltage. 
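
The average power calculation described above can be sketched outside the simulator as follows. This is only an illustration of the arithmetic, not the HSPICE MEASURE statement itself; the function name, the trapezoidal integration, and the sample waveform are assumptions.

```python
def average_power(times, supply_current, vdd):
    """Average supply power: vdd * (integral of i dt) / simulation time.

    `times` (seconds) and `supply_current` (amperes) are sampled points of
    the current drawn from the supply; the integral is approximated with
    the trapezoidal rule.
    """
    if len(times) != len(supply_current) or len(times) < 2:
        raise ValueError("need at least two matching samples")
    charge = 0.0  # total charge Q drawn from the supply, in coulombs
    for k in range(1, len(times)):
        dt = times[k] - times[k - 1]
        charge += 0.5 * (supply_current[k] + supply_current[k - 1]) * dt
    average_current = charge / (times[-1] - times[0])
    return vdd * average_current


# Hypothetical triangular current pulse peaking at 2 mA over 2 ns, 2.0 V supply.
sample_times = [0.0, 1e-9, 2e-9]
sample_current = [0.0, 2e-3, 0.0]
p_avg = average_power(sample_times, sample_current, 2.0)
```

For this hypothetical pulse the total charge is 2 pC over 2 ns, giving an average current of 1 mA and an average power of 2 mW.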

A. CGaAs STATIC CIRCUIT DESIGN 

The design methodology used in this section is for CGaAs static logic gates and is very 
similar to that of CMOS logic gates. The design priority emphasizes reliability over speed. 
Therefore, the noise margins are optimized. Equal low and high noise margins, if possible, 
will provide the best circuit reliability and should be the design goal. Width ratios of PFETs 
and NFETs are chosen for the best noise margins with a fan-out of two (two inverter load). 
Changing the gate widths of the transistors while keeping the same width ratio will affect 
the speed of the gates (pull-up and pull-down times). 

1. Static Inverter Circuit Design 

The schematic of a CGaAs static inverter is shown in Figure 3.1. All transistor gate 
lengths are 0.7 µm and transistor gate widths in microns are indicated on the diagram. The 
design of this gate has been tried for different width ratios of the NFET to the PFET 
(Wn/Wp) to get the best noise margin. Width to length ratios (W/L) of both the NFET and the 
PFET are changed, keeping a constant Wn/Wp, to get the best drive capability of the circuit 
with a fan-out of two. The inverter test circuit in Figure 3.2 consists of the inverter under 
test preceded by two inverters to serve as a pulse shaping circuit. The inverter under test is 
followed by two inverters as an output loading circuit. This test circuit was simulated in 
HSPICE and Figure 3.3 shows the DC transfer characteristic of the inverter from the 
HSPICE simulation program. This characteristic curve was generated at a supply voltage 
of 2.0 volts with input signal transitions between 0.0 and 1.75 volts. 

Figure 3.1: CGaAs Static Inverter 

[Block diagram: Circuit Input, Pulse Shaping Circuit, Inverter Input, Inverter Under Test, Inverter Output, Loading Circuit] 

Figure 3.2: CGaAs Static Inverter Test Circuit 

The inverter DC transfer curve shown in Figure 3.3 consists of five regions. The first 
region is characterized by 0 < Vin < Vtn, where the NFET is cut off and the PFET is in the 
linear region. The second region is defined by Vtn < Vin < VDD/2, where the PFET is still 
in the linear region and the NFET is in the saturation region. In the third region, both the 
NFET and the PFET are in the saturation region. In this region, the output switches from 
high to low. The value of the input voltage at which the inverter output is switched depends 
on the width ratio between the NFET and the PFET (Wn/Wp). This is the region at which 
the inverter consumes short-circuit power because of the direct path of current from VDD to 
ground through the ON transistors. The fourth region is described by VDD/2 < Vin < VDD - 
|Vtp|. The NFET is in the linear region and the PFET is in the saturation region. The last 
region is defined by Vin > VDD - |Vtp|, where the NFET is in the linear region and the PFET 
is in the cut-off region. These five regions are identical to those for the CMOS inverter 
described in reference 32. Vtn and Vtp are the threshold voltages for N- and P-channel 
transistors, respectively. 

Figure 3.3: DC Transfer Characteristics of CGaAs Static Inverter 
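
The five regions of the inverter transfer curve can be summarized in a small lookup sketch. It assumes the idealized CMOS-style boundaries (Vtn, VDD/2 and VDD - |Vtp|), with the third region collapsed to the single switching point Vin = VDD/2; the function name and the threshold values of 0.5 V and -0.5 V are hypothetical and are not the Motorola device parameters.

```python
def inverter_region(vin, vdd, vtn, vtp):
    """Return (region number, NFET state, PFET state) for a static inverter.

    Uses idealized boundaries in which the switching point sits exactly
    at vdd / 2, so region 3 is a single point in this sketch.
    """
    if vin < vtn:
        return 1, "cut-off", "linear"
    if vin < vdd / 2:
        return 2, "saturation", "linear"
    if vin == vdd / 2:
        return 3, "saturation", "saturation"
    if vin < vdd - abs(vtp):
        return 4, "linear", "saturation"
    return 5, "linear", "cut-off"


# Hypothetical thresholds of +0.5 V and -0.5 V with a 2.0 V supply.
region_numbers = [
    inverter_region(v, 2.0, 0.5, -0.5)[0] for v in (0.2, 0.8, 1.0, 1.2, 1.8)
]
```

In a real device the third region spans a narrow range of input voltages, and its position shifts with the width ratio of the two transistors, so the exact equality used here is only a simplification.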



The best results for noise margins are obtained when both N- and P-FETs have the 
same gate widths (Wn/Wp = 1), which agrees with the results mentioned in reference 61. 
From Figure 3.3, the noise margin high (NMH) is measured to be 0.8 volts and the noise 
margin low (NML) is 0.9 volts at Wn = Wp = 10 µm and L = 0.7 µm with a fan-out of two. 
Figure 3.4 is the transient analysis of the same inverter which shows the switching speed. 
The pull-up and pull-down times, measured between 10% and 90% of the logic high level, 
are 0.29 ns and 0.16 ns respectively. The maximum frequency of operation of this inverter 
is 1.2 GHz with a fan-out of two. The average power consumed by the inverter at the 
maximum frequency of operation is 2.17 mW from a 2.0 V power supply. The maximum 
operating frequency decreases as the loading increases because increasing the load will 
increase the output capacitance and consequently the pull-up and pull-down times. Also, 
the speed of the circuit and the power consumed are dependent on the power supply 
voltage. Increasing the supply voltage will increase the maximum frequency of operation 
but will also increase the consumed power (a trade-off). As the supply voltage exceeds 2.0 
volts, the consumed power increases dramatically due to the drain-to-source leakage 
current, which creates a current flow from VDD to ground. Also, increasing the input signal 
transition over 1.75 volts will increase the consumed power dramatically due to the gate 
leakage current, which is explained in detail in Chapter II. Therefore, the supply voltage of 
all designs in this chapter and following chapters will be limited to 2.0 volts and the input 
gate transition limited to 1.75 volts. 
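
One common way to read noise margins from a DC transfer curve is to locate the two points where the slope equals -1 and apply NML = VIL - VOL and NMH = VOH - VIH. The sketch below follows that convention; whether the measurements in this chapter used the same criterion is an assumption, and the tanh-shaped curve is synthetic data rather than HSPICE output.

```python
import math


def noise_margins(vin, vout):
    """Return (NML, NMH) from a sampled, falling inverter transfer curve.

    VIL and VIH are taken at the edges of the span where the slope is
    steeper than -1 (the unity-gain criterion, assumed here).
    """
    slopes = [
        (vout[k + 1] - vout[k]) / (vin[k + 1] - vin[k])
        for k in range(len(vin) - 1)
    ]
    steep = [k for k, s in enumerate(slopes) if s < -1.0]
    if not steep:
        raise ValueError("curve never reaches unity gain")
    first, last = steep[0], steep[-1]
    vil, voh = vin[first], vout[first]
    vih, vol = vin[last + 1], vout[last + 1]
    return vil - vol, voh - vih


# Synthetic, symmetric transfer curve swinging between about 2 V and 0 V.
vin_samples = [k * 0.01 for k in range(201)]
vout_samples = [1.0 - math.tanh(5.0 * (v - 1.0)) for v in vin_samples]
nml, nmh = noise_margins(vin_samples, vout_samples)
```

Because the synthetic curve is symmetric about its midpoint, the two margins come out equal, which corresponds to the design goal of equal low and high noise margins stated in this section.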

2. Static NAND and NOR Gates Circuit Design 

CGaAs Static NAND gate and NOR gate schematics are shown in Figure 3.5. The gate 
length of all transistors is 0.7 µm. The transistor gate widths in microns are indicated on the 
diagram. The given transistor sizes yield the best results for noise margins, drive capability 
and speed. The DC transfer curves of both the NAND gate and the NOR gate, obtained 
from the HSPICE simulation, are shown in Figures 3.6 and 3.7, respectively. During 
simulation, each gate input was preceded by a pulse shaping circuit consisting of two 
cascaded inverters. The NAND gate DC transfer curve is obtained when the gate is 
powered with 2.0 volts. One input of the gate is tied to logic high and the other input switches 
between 0.0 and 1.75 volts. The NOR gate DC transfer curve is obtained in the same way 
except the non-ramped input is tied to ground to propagate the effect of the ramped input 
to the gate output. From these figures, the noise margin low and noise margin high of both 
the NAND gate and the NOR gate are the same and are equal to 0.8 volts. The transient 
analyses for both gates are also plotted in Figures 3.8 and 3.9, respectively, as an output of 
an HSPICE simulation. The circuit was powered from a 2.0 V power supply and input 
signals switch between 0.0 and 1.75 volts with a fan-out of two. 

Figure 3.5: CGaAs Static NAND and NOR Gates 

The power supply voltage of the NAND gate was changed from 2.0 V to 1.0 V in 0.25 
V steps to study the effect of changing the power supply on the maximum operating 
frequency. Figure 3.10 shows the average power consumed by the NAND gate as a 
function of frequency at different power supply voltages. The maximum operating 
frequency of the NAND gate is 1.22 GHz at a supply voltage of 2.0 V. It consumes 5.8 mW 
at this maximum frequency. Decreasing the power supply voltage will reduce both the 
maximum frequency of operation and the consumed power. Therefore, when working at a 
low frequency, the circuit can be operated from a lower power supply voltage to decrease 
the power consumed. The effect of increasing the output load on the maximum operating 
frequency of the NAND gate is also plotted in Figure 3.11. It is clear from this figure that 
the maximum frequency of operation decreases when the load increases because of the 
increased output capacitance. 

For a CGaAs static NOR gate, the maximum frequency of operation is less than for a
NAND gate because it contains two slow PFETs in series. In the NOR gate circuit, the
PFET gate widths are increased to 10 µm (compared to 5 µm for the NAND gate PFETs)
to compensate for this effect. The maximum frequency of operation of this gate is
0.82 GHz with a power supply of 2.0 V and an input signal that switches between 0.0 V and
1.75 V. The average power consumed by the gate at the maximum operating frequency is
6.5 mW. It is worth mentioning that the loading and power supply voltage effects on NOR
gate performance are the same as for a NAND gate.

Figure 3.7: DC Transfer Curve of CGaAs Static NOR Gate




Figure 3.8: Transient Analysis of CGaAs Static NAND Gate

Figure 3.9: Transient Analysis of CGaAs Static NOR Gate




Figure 3.10: Power Consumption of CGaAs Static NAND Gate

Figure 3.11: Loading Effects on CGaAs Static NAND Gate

3. Static XOR Gate Circuit Design 

Two different implementations of CGaAs Static XOR gates have been designed and 
simulated using HSPICE to measure gate propagation delay, maximum frequency of 
operation and consumed power. The better design was then selected to be used as a building 
block in later circuits discussed in the following chapters. The simulation is carried out 
using a supply voltage of 2.0 V. The input signals were switched between 0.0 V and 1.75 
V and a load of two inverters was used. 

The first XOR gate consists of 6 transistors and the schematic is shown in Figure 3.12.
Transistor gate widths in microns are written on the diagram, while all transistor gate
lengths are 0.7 µm. The maximum operating frequency of this gate is 0.55 GHz with an
average consumed power of 3.2 mW at that frequency. The second design consists of 8
transistors with the schematic shown in Figure 3.13. The maximum operating frequency of
this gate is 0.7 GHz, which is higher than the maximum frequency of the first design. The
average consumed power of this gate at the maximum frequency is 6.2 mW.

In a comparison between the two designs, the first gate has the advantage of fewer
transistors and lower average power consumption. However, the first design has the major
disadvantage of an asymmetrical propagation delay. That is, the propagation delay from
the A input to the output is not the same as from the B input to the output. Also, the
propagation delay from the A input to the output depends on the B input logic level. The
8-transistor XOR gate has a higher maximum frequency of operation with a symmetrical
propagation delay. Therefore, the 8-transistor XOR gate will be used as a building block
in the circuits in the next chapters. An HSPICE simulation was performed for these two
gates at different transistor gate widths to find the optimum widths. The simulation was
also conducted at different power supply voltages to study the effect of the power supply
on the performance of the gates. This led to the same conclusions as the NAND and NOR
gate simulations described earlier.







4. Static XNOR Gate Circuit Design 

Following the same procedures conducted for XOR gate design in the previous 
subsection, two different XNOR gate circuits were simulated using HSPICE. The optimum 
design from the point of view of maximum frequency of operation, layout area and power 
consumption is selected for use in the circuits of the next chapter. 

The schematic of the first design is shown in Figure 3.14 and contains 10 transistors.
All transistor gate widths are 10 µm and transistor gate lengths are 0.7 µm. The maximum
operating frequency of this gate is 0.7 GHz with an average power consumption of 11 mW
at that frequency. The schematic of the second design, shown in Figure 3.15, consists of 8
transistors. Transistor gate widths are in microns and are written on each transistor, and gate
lengths are 0.7 µm. The maximum frequency of operation of this gate is the same as that of
the 8-transistor XOR gate and is 0.7 GHz. The average power consumed is 6.2 mW.

Comparing the two designs, the second design outperforms the first one because the
layout area is smaller. Also, the average power consumption is less than that of the first one.
Therefore, the 8-transistor XNOR gate will be used in the later chapters as a building block
in the CGaAs Static circuits.

Figure 3.15: CGaAs Static Eight-Transistor XNOR Gate



5. CGaAs Ring Oscillator Design Using Static NOR Gates

The ring oscillator is an asynchronous test structure for propagation delay
measurements. It is the easiest way to accurately measure the logic gate propagation delay.
The ring oscillator in Figure 3.16 consists of an odd number (n) of inverting logic gates
connected in a closed loop. When power is applied, the circuit starts to oscillate at
a frequency $f = 1/(2 n t_{pd})$, where $n$ is the number of gates and $t_{pd}$ is the average
gate propagation delay. The propagation delay per gate can therefore be found from a single
frequency measurement. A large number of gates should be used in this measurement to
decrease the error in measuring the propagation delay. Increasing the number of gates also
reduces the oscillation frequency and makes it suitable for measurement with conventional
test equipment.

The ring oscillator designed here is shown in Figure 3.16 and it consists of eleven NOR
gates. The output frequency is 149 MHz. It consumes an average power of 21.7 mW when
powered from a 2.0 V power supply. All transistor gate widths of the circuit are 10 µm and
lengths are 0.7 µm. The input and output waveforms for this oscillator are shown in Figure
3.17. The first waveform is the applied reset pulse to start the oscillation and the second
waveform is the output of the circuit. From the measured frequency, the propagation delay
of one NOR gate is calculated to be 0.305 ns.
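
The delay calculation above can be reproduced with a few lines of Python. This is only an illustrative sketch: the function and variable names are hypothetical, the input values are the ones quoted in the text, and the per-gate power (and the power-delay product derived from it) rests on the assumption, not stated in the source, that the total power is shared equally by the eleven gates.

```python
def gate_delay_from_ring(freq_hz: float, n_gates: int) -> float:
    """Average propagation delay per gate of a ring oscillator: t_pd = 1 / (2 n f)."""
    if n_gates < 1 or n_gates % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of inverting gates")
    if freq_hz <= 0:
        raise ValueError("oscillation frequency must be positive")
    return 1.0 / (2 * n_gates * freq_hz)


# Values reported in the text for the eleven-gate NOR ring oscillator.
N_GATES = 11
F_OSC_HZ = 149e6      # output frequency, 149 MHz
P_TOTAL_W = 21.7e-3   # average power of the whole oscillator, 21.7 mW

t_pd = gate_delay_from_ring(F_OSC_HZ, N_GATES)

# Assumption (not from the source): the total power is split evenly over the gates.
p_gate = P_TOTAL_W / N_GATES
pdp = p_gate * t_pd   # power-delay product of one gate under that assumption

print(f"t_pd   = {t_pd * 1e9:.3f} ns")
print(f"P/gate = {p_gate * 1e3:.2f} mW")
print(f"PDP    = {pdp * 1e12:.2f} pJ")
```

The first printed value should agree with the 0.305 ns quoted above; the other two are derived quantities that depend on the equal-sharing assumption.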
Figure 3.17: Output Waveforms of CGaAs Eleven NOR Gate Ring Oscillator



B. CGaAs TPDL CIRCUIT DESIGN 

CGaAs Two-Phase Dynamic Logic (TPDL) circuits use only the fast N-channel 
transistors in evaluating a logic function. P-channel transistors are used only for 
precharging the output nodes. Also, there is no direct current path from supply voltage to 
ground at any time, which eliminates static and short-circuit power consumption in this 
logic family. The evaluating transistor block (NFETs) can be designed using minimum size 
transistors (in most cases) without affecting the noise margins or the speed. This will reduce 
both the layout area and the consumed power. Increasing the sizes of the evaluating 
transistor block will increase the drive capability of the circuit, but will increase the output 
capacitance which decreases the maximum operating frequency (trade off). Decreasing the 
power supply voltage will decrease both the power consumption and the power-delay 
product, but will also decrease the noise margin (trade off). As the number of transistors in 
the evaluating transistor block increases, the size of the clocked PFET needs to be increased 
to be able to quickly charge the output node. 

The basic circuit topology of this family is shown in Figure 3.18. The circuits consist
of two main blocks, a φ1 block and a φ2 block. Each block consists of pass gates, a clocked
precharge PFET, a clocked discharge NFET and an N-transistor logic block. The output of
a φ1 stage cannot be connected to the input of another φ1 stage or fed back to itself. The
same condition applies to a φ2 stage.

The detailed operation of the circuit is as follows. When φ1 is low and φ2 is high, the
φ1 stage is precharging its output node N1 through the ON transistor Q1. At the same time,
the input data is passed to the N-transistor logic block through the ON φ1 pass gates. The
precharged output of this stage is isolated from the inputs of the next φ2 stage by the OFF φ2
pass gate. Meanwhile, the φ2 stage is evaluating the data stored on the N-transistor logic
block inputs (φ2 is high and Q3 is OFF while Q4 is ON). The evaluated output is passed to
the next φ1 stage through the ON φ1 pass gate. When φ1 is high and φ2 is low, the φ1 stage is
evaluating the data stored on the input of the N-transistor logic block. The output (node N1)
is fed to the next φ2 stage through the ON φ2 pass gate. At the same time, the φ2 stage is
precharging its output, which is isolated from the next φ1 stage by the OFF φ1 pass gate.

When both φ1 and φ2 are high at the same time (they are non-overlapped only in the logic
low level), both stages evaluate the inputs stored on their N-transistor logic blocks. The
outputs of both stages are isolated from the next stage by the OFF pass gates, so there is no
corruption of data. The two phases must be non-overlapped in the low state to prevent data
corruption. If this condition is not satisfied and both phases are low at the same time, both
stages precharge their output nodes and the precharged outputs are passed to the inputs of
the next stage. When either φ1 or φ2 then switches to logic high, the corresponding stage
starts to evaluate the erroneous inputs (the precharged outputs of the previous stages) and
gives an erroneous output.

If any input to a φ1 stage is supplied from another (non-TPDL) circuit, it has to be stable
(unchanging) while φ1 is low. Similarly, if any input to a φ2 stage is supplied from a
non-TPDL circuit, this input has to be stable while φ2 is low. Another condition that must
be satisfied is that φ1 stage outputs can only be connected to φ2 stage inputs and φ2 stage
outputs can only be connected to φ1 stage inputs. This is similar to Si CMOS “zipper”
logic [36].

An inverter, NAND gate and NOR gate are implemented here because they are the
main building blocks of any logic family. Figure 3.19 shows the N-transistor logic block
representation of these gates. The following sections show the simulation results of these
TPDL gates as well as for TPDL XOR and XNOR gates using HSPICE circuit simulation
tools. HSPICE simulation results indicate that the TPDL logic family is superior to the static
logic family, as will be explained in the comparison section later in this chapter.
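
The clocking and connection rules described above for TPDL stages can be summarised as a small behavioural check. The sketch below is purely illustrative and is not part of the original design flow: the function names and the sampled waveforms are hypothetical, and each clock is modelled as an ideal 0/1 sequence sampled at common instants.

```python
from typing import Sequence


def stage_mode(phi: int) -> str:
    """A TPDL stage precharges while its clock is low and evaluates while it is high."""
    return "evaluate" if phi else "precharge"


def never_both_low(phi1: Sequence[int], phi2: Sequence[int]) -> bool:
    """True if the two sampled phases are never low at the same instant."""
    if len(phi1) != len(phi2):
        raise ValueError("both phases must be sampled at the same instants")
    return all(a == 1 or b == 1 for a, b in zip(phi1, phi2))


def valid_connection(driver_phase: int, receiver_phase: int) -> bool:
    """A phase-1 stage may only drive a phase-2 stage, and vice versa."""
    return {driver_phase, receiver_phase} == {1, 2}


# Hypothetical sampled clocks (1 = high, 0 = low).
PHI1 = [0, 0, 1, 1, 1, 1, 1, 1]
PHI2 = [1, 1, 1, 1, 0, 0, 1, 1]       # low only while PHI1 is high
PHI2_BAD = [1, 0, 0, 1, 1, 1, 1, 1]   # overlaps PHI1 in the low state

print(never_both_low(PHI1, PHI2))       # legal clocking
print(never_both_low(PHI1, PHI2_BAD))   # both stages would precharge together
print(valid_connection(1, 2), valid_connection(1, 1))
```

A check of this kind could be run on clock waveforms exported from a simulator before the logic itself is examined, since a low-state overlap corrupts data regardless of the logic function.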



Figure 3.18: Basic Circuit Topology of CGaAs TPDL Gate

1. TPDL Inverter Circuit Design 

The TPDL inverter circuit shown in Figure 3.19 has been inserted in the test circuit
shown in Figure 3.20. The test circuit was then simulated using HSPICE to test the TPDL
inverter performance. HSPICE simulation results show that this TPDL inverter has a
maximum frequency of 2.38 GHz with a fan-out of two when powered from a supply
voltage of 2.0 V. The input signal switches between 0.0 V and 1.75 V. The average power
consumed by the inverter at the maximum frequency is 1.7 mW. Input and output
waveforms for the inverter operating at a frequency of 2.38 GHz are shown in Figure 3.21.
The first waveform is the input signal applied to the φ1 section of the circuit. This signal
must be stable while φ1 is low because during this time the input signal is passed through the
φ1 pass gate to the φ1 section evaluating block; the evaluation is correct only if the stored
value is stable. The second waveform is the clocking signal, while the third
waveform is the inverter output. The output of this inverter can only be applied to a φ2
section input because it is an output of a φ1 section. The inverter output precharges when
φ1 is low and evaluates the input while φ1 is high. Thus it can be sampled at the end of the φ1
evaluation phase (at the end of the φ1 high period).

Figure 3.19: CGaAs TPDL Combinational Logic Gates

Figure 3.20: CGaAs TPDL Inverter Test Circuit

Figure 3.21: Input-Output Waveforms of CGaAs TPDL Inverter



2. TPDL NAND Gate Circuit Design 

The TPDL NAND gate shown in Figure 3.19 was inserted in a test circuit to study its
performance for the comparison of this logic family with static logic. The circuit shown in
Figure 3.22 is the complete test circuit simulated using HSPICE to test the performance of
the TPDL NAND gate. The maximum frequency of operation achieved by this circuit is
2.38 GHz when powered from a supply voltage of 2.0 V. The input switches between 0.0
V and 1.75 V. The average power consumed by this TPDL NAND gate at the maximum
frequency is 1.98 mW. The input and output waveforms of the NAND gate are shown in
Figure 3.23. The first waveform is the A input of the NAND gate (the B input is held
at logic high), the second waveform is the φ1 clocking signal, and the last waveform is the
gate output. As shown in Figure 3.23, the input signal is applied to the φ1 section and is
stable (unchanged) while φ1 is low to ensure correct operation of the circuit.

Figure 3.22: CGaAs TPDL NAND Gate Test Circuit

Figure 3.23: Transient Analysis of CGaAs TPDL NAND Gate

The effect of changing the power supply voltage on the NAND gate average power
consumption and maximum frequency of operation was also studied. Figure 3.24 shows the
average power consumption as a function of the operating frequency for different supply
voltages and input transitions. The consumed power is linearly proportional to the
frequency of operation. Also, the consumed power increases with the supply
voltage due to the increase in the drain-source leakage current and the gate conduction
current. The effect of increasing the output load on the maximum operating frequency of
the NAND gate, powered from a 2.0 V power supply, was also studied and is plotted in Figure
3.25. As the load increases, the maximum frequency of operation decreases. The decrease
in the maximum frequency is small (compared with that of the static logic family)
because the load is separated from the driving circuit by the pass gate. The effect of loading
on the maximum frequency is due to the charge redistribution problem, which is a common
problem in all dynamic logic families.

Figure 3.24: CGaAs TPDL NAND Gate Power Consumption




Figure 3.25: Loading Effects on CGaAs TPDL NAND Gate

3. TPDL NOR Gate Circuit Design

The TPDL NOR gate shown in Figure 3.19 has been simulated using HSPICE
simulation tools. The test circuit used to accomplish this is shown in Figure 3.26. The maximum
operating frequency of this logic gate is 2.38 GHz (the same as the NAND gate) when powered
from a 2.0 V power supply with an input signal that switches between 0.0 V and 1.75 V. The
average power consumed by the gate at the maximum frequency is 2.01 mW. When the
supply voltage decreases to 1.75 V, the same maximum frequency can be approached but
the average consumed power drops to 1.04 mW. This drop is due to the decrease in both
the drain-source leakage current and the gate conduction current. Input and output
waveforms of the circuit, when simulated in HSPICE, are shown in Figure 3.27. The first
waveform is the input applied to one input of the gate while the other input is held at 0.0 V.
The second waveform is the clocking signal and the last waveform is the logic gate
output. As mentioned before, the input signals are allowed to switch between the logic
levels only when the corresponding φ clock is logic high.




Figure 3.26: CGaAs TPDL NOR Gate Test Circuit

Figure 3.27: Input-Output Waveform of CGaAs TPDL NOR Gate



4. TPDL XOR Gate Circuit Design 

A useful complex logic gate in many combinational circuits is the XOR gate. A TPDL
CGaAs XOR gate has been designed using three TPDL NAND gates and one static NAND
gate as shown in Figure 3.28. X1 is the static NAND gate, while X2, X3 and X4 are the
TPDL NAND gates. The circuit was simulated using the HSPICE simulation tool. The
input and output waveforms at the maximum operating frequency with a fan-out of two are
shown in Figure 3.29. The maximum operating frequency of this logic gate, according to
HSPICE simulations, is 1.61 GHz when powered from a 2.0 V power supply and with an
input signal transition between 0.0 V and 1.75 V. The average power consumed by the
TPDL XOR gate at the maximum operating frequency is 4.8 mW when one of the inputs
is switching and the other input is tied to 1.75 V. When one input of the gate is switching
and the other input is tied to 0.0 V, the average consumed power drops to 3.98 mW. The
TPDL XOR gate maximum frequency of operation is higher than that of the static gate. The
average consumed power is lower than that for the static gate and will be explained in detail
in the comparison between TPDL and static logic, Section D.
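
A four-NAND XOR can be checked at the logic level with a short truth-table script. The wiring used below is an assumption: it is the classic four-NAND XOR arrangement with X1 as the first gate, which may or may not match Figure 3.28 exactly. The script models only Boolean behaviour, not the static or TPDL timing, and it also checks the well-known dual in which the same wiring built from NOR gates yields XNOR.

```python
from itertools import product


def nand(a: int, b: int) -> int:
    return 1 - (a & b)


def nor(a: int, b: int) -> int:
    return 1 - (a | b)


def xor_from_nands(a: int, b: int) -> int:
    """Classic four-NAND XOR; gate labels X1..X4 follow the text, wiring is assumed."""
    x1 = nand(a, b)      # X1: the static NAND gate in the text
    x2 = nand(a, x1)     # X2
    x3 = nand(b, x1)     # X3
    return nand(x2, x3)  # X4: gate output


def xnor_from_nors(a: int, b: int) -> int:
    """Same wiring with every NAND replaced by a NOR, which produces XNOR."""
    y1 = nor(a, b)
    y2 = nor(a, y1)
    y3 = nor(b, y1)
    return nor(y2, y3)


for a, b in product((0, 1), repeat=2):
    print(a, b, xor_from_nands(a, b), xnor_from_nors(a, b))
```

Each printed row lists A, B, the NAND-based output and the NOR-based output, so the two rightmost columns should always be complements of each other.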




Figure 3.28: CGaAs TPDL XOR Gate

Figure 3.29: Input-Output Waveforms of CGaAs TPDL XOR Gate

5. TPDL XNOR Gate Circuit Design

The XNOR gate is also a useful complex combinational logic gate. A TPDL CGaAs
XNOR gate has been designed using three TPDL NOR gates and one static NOR gate. The
TPDL XNOR gate is similar to the circuit shown in Figure 3.28 but all NAND gates are
replaced by NOR gates. This circuit was simulated using HSPICE simulation tools. The
input and output waveforms of the simulated circuit are shown in Figure 3.30. The maximum
frequency of operation of the circuit is 1.91 GHz at a power supply voltage of 2.0 V. The
input signal switches between 0.0 V and 1.75 V with a load of two inverters (fan-out of
two). The average power consumption of the gate at the maximum operating frequency is
7.61 mW when one input is switching and the other input is tied to 0.0 V. When the other
input is tied to 1.75 V, the average consumed power drops to 6.27 mW.

C. DESIGN OF CGaAs TWO PHASE NON-OVERLAPPING CLOCK GENERATOR

The operation of TPDL circuits requires two clock phases and their complements, which
motivates the design of such a clock generator. The latest design of a two-phase clock
generator was found in reference [43]. It has a maximum frequency of operation of 0.5 GHz
and consumes an average power of 52 mW from a 1.5 V power supply. That
clock generator produces only the two non-overlapped clocks without their complements,
which is half of what the design described in this section produces. The design in
reference [43] was fabricated using an N-MESFET process, which is faster than the CGaAs
process but consumes much more power. In this section, the design, simulation and
performance tests of a circuit that generates two non-overlapping clocks and their
complements are explained.

The two-phase non-overlapping clock generator consists of two main parts. The first
part generates the two phases, φ1 and φ2, from the input clock signal. The output of the first
part is fed to the second part, which generates the complements. The key to the design is
that the two generated clock phases need to be non-overlapped in the logic low level (a
requirement of the TPDL circuits). Also, the complements of the two phases need to be
non-overlapped in the logic high level. This implies that each phase and its complement
need to be 180 degrees out of phase. As the frequency increases, the clock period decreases,
which makes the phase error crucial. The logic diagram of the generator circuit is shown in
Figure 3.31. The circuit diagram of the generator is shown in Figure 3.33. Transistor gate
widths, in microns, are indicated in Figure 3.33, while transistor gate lengths are 0.7
µm. The gate widths of the pass-gate transistors in this figure are 2 µm for both
NFETs and PFETs.
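
To see why the phase error becomes more critical as the clock period shrinks, a fixed timing skew between a phase and its complement can be converted to degrees of the clock period. The sketch below is illustrative only: the 50 ps skew is a hypothetical value chosen for the example, not a measured property of this generator, and the function name is likewise an assumption.

```python
def skew_to_phase_error_deg(skew_s: float, clock_hz: float) -> float:
    """Phase error in degrees caused by a fixed timing skew at a given clock frequency."""
    if clock_hz <= 0:
        raise ValueError("clock frequency must be positive")
    # One clock period (1 / clock_hz seconds) corresponds to 360 degrees.
    return 360.0 * skew_s * clock_hz


SKEW_S = 50e-12  # hypothetical 50 ps skew between a phase and its complement

for f_ghz in (0.25, 0.5, 1.0):
    error = skew_to_phase_error_deg(SKEW_S, f_ghz * 1e9)
    print(f"{f_ghz:.2f} GHz: {error:.1f} degrees away from the ideal 180")
```

The same skew costs twice as many degrees each time the frequency doubles, which is the sense in which the phase error grows in importance at high clock rates.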
Figure 3.31: Logic Diagram of Two-Phase Clock Generator

The generator circuit was designed, then simulated using HSPICE simulation tools and
finally implemented using CADENCE tools to be fabricated in the Motorola fabrication
process line. Its gate layout area is 77.7 µm². When the circuit is powered from 2.0 V and
the clock input switches between 0.0 V and 1.75 V, the maximum operating frequency of
the circuit is 1.0 GHz for a fan-out of two. The average power consumed by the circuit at
the maximum operating frequency is 25.8 mW. Figure 3.32 shows the input and output
waveforms of the clock generator at 1.0 GHz. The generated clock phases, φ1 and φ2, are
non-overlapped in the logic low level. The effect of increasing the load on the maximum
operating frequency of the circuit was also studied and is plotted in Figure 3.34. As seen from
this figure, the maximum frequency of operation of the circuit decreases with increasing load.
To drive a high capacitive load, a driver circuit for each phase is required to reduce the
effect of the load on the maximum frequency of operation. Also, as the frequency increases,
the power consumption of the circuit increases due to the increase in the dynamic power
consumption, which is frequency dependent. The average power consumed by the circuit,
as a function of frequency, is plotted in Figure 3.35. The power dependence on the
frequency is linear.

Figure 3.32: Input-Output Waveforms of Clock Generator

Figure 3.33: Circuit Diagram of Two-Phase Clock Generator







Figure 3.34: Loading Effects on Clock Generator 




Figure 3.35: Power Consumption of Clock Generator

D. COMPARISON BETWEEN CGaAs STATIC AND TPDL COMBINATIONAL LOGIC CIRCUIT DESIGNS

In Sections A and B, the design of both CGaAs static and TPDL logic gates is
explained in detail. The design of TPDL circuits is more complicated, but it has many
advantages over the static design: the maximum frequency of operation is higher, the
power consumption is lower and the layout area is smaller. Its main
disadvantage is the need for two non-overlapped clock phases and their complements for
operation, together with a more complex logic and circuit design. The two clock phases have
to be non-overlapped in the logic low level to prevent data corruption, as explained in
Section C. Table 3.1 shows the comparison between static logic designs and TPDL designs
[49]. As seen from the table, TPDL circuits have a maximum frequency of operation about
double that of the static circuits. Also, the consumed power at the maximum frequency is
always less than that of the static circuits.

For the power consumption comparison to be fair, two factors must be considered.
First, the average consumed power of both TPDL and static circuits must be calculated at
the same frequency. For example, from Table 3.1, the static NAND gate maximum
frequency of operation is 1.2 GHz and the power consumption at that frequency is 5.8 mW,
while the power consumed by the TPDL NAND gate at 2.38 GHz is 1.98
mW. When both powers are calculated at 1.2 GHz and both gates are connected to the same
power supply, the fair comparison is 5.8 mW for the static gate against 1.4 mW for the TPDL
gate. However, the TPDL circuit will
function properly at 1.2 GHz when powered from a 1.0 V power supply, which reduces its
average consumed power to 0.15 mW. Therefore,
at 1.2 GHz, the comparison between static and TPDL average consumed power is 5.8
mW to 0.15 mW, a ratio of more than thirty-eight. Second, for the comparison to be fair,
the layout area and the power consumed by the two-phase non-overlapping clock generator
must be considered. The gate layout area of the clock generator is 77 µm² and it consumes
25 mW at its maximum operating frequency. If the clock generator explained in the previous
section drives a hundred combinational logic gates, its power consumption and layout
area should be distributed over these driven gates. Dividing the layout area and the power
consumption of the clock generator over the driven gates will increase theirs by less than
10%. Thus, this factor has a small effect on the comparison.
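
The arithmetic behind this comparison is easy to reproduce. The sketch below uses only the figures quoted in the paragraph above; the variable names are hypothetical, and the per-gate clock share assumes, as in the text, that one clock generator is shared evenly by one hundred driven gates.

```python
# Average NAND gate power at 1.2 GHz, in mW, as quoted in the text.
STATIC_NAND_MW = 5.8
TPDL_NAND_2V_MW = 1.4    # TPDL gate on the same 2.0 V supply
TPDL_NAND_1V_MW = 0.15   # TPDL gate on a 1.0 V supply

ratio_same_supply = STATIC_NAND_MW / TPDL_NAND_2V_MW
ratio_low_supply = STATIC_NAND_MW / TPDL_NAND_1V_MW

# Clock generator overhead, shared evenly by the driven gates.
CLOCK_POWER_MW = 25.0
CLOCK_AREA_UM2 = 77.0
DRIVEN_GATES = 100

clock_power_per_gate_mw = CLOCK_POWER_MW / DRIVEN_GATES
clock_area_per_gate_um2 = CLOCK_AREA_UM2 / DRIVEN_GATES

print(f"static/TPDL power ratio, same supply: {ratio_same_supply:.1f}")
print(f"static/TPDL power ratio, 1.0 V TPDL:  {ratio_low_supply:.1f}")
print(f"clock power share per gate: {clock_power_per_gate_mw:.2f} mW")
print(f"clock area share per gate:  {clock_area_per_gate_um2:.2f} um^2")
```

Changing DRIVEN_GATES shows how quickly the clock overhead per gate falls as one generator serves more logic.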

For the comparison to be clear, the NAND gate will be taken as a study case. The 
power consumption versus the operating frequency at different power supply voltages for 
both static and TPDL designs is shown in Figure 3.36. This figure actually combines the 
two graphs for both static and TPDL NAND gates plotted previously. The power-delay
product for the TPDL and static logic NAND gates is plotted in Figure 3.37. The
power-delay product decreases as the power supply voltage decreases because of the decrease
in the leakage current. Also, loading effects on both designs are shown in Figure 3.38. This figure
shows that the loading effect has less influence on the TPDL gate maximum frequency of 
operation than for the static gate. This is because the load of the TPDL gate is isolated from 
the output node by the pass gate. Loading effects on the static design are due to the 
increased output capacitance, while loading effects on the TPDL design are due to the 
charge redistribution problem, which is a common problem for all dynamic logic circuits. 

This concludes the combinational logic circuit design using both TPDL and static 
logic. In the next chapter, the design and implementation of the sequential logic circuits is 
presented. Also, the performance comparison between the two techniques is discussed. 



Table 3.1: CGaAs Static and TPDL Combinational Circuit Performance

Designed      Circuit     Maximum      Average Power   Number of     Total Layout
Logic Gate    Topology    Frequency    @ Fmax [mW]     Transistors   Area [µm²]
                          [GHz]
Inverter      Static      1.2          2.17            2             14
Inverter      TPDL        2.38         1.7             3             7.7
NAND          Static      1.2          5.8             4             28
NAND          TPDL        2.38         1.98            4             10.5

Figure 3.36: CGaAs Static and TPDL NAND Gate Power Consumption

Figure 3.37: CGaAs Static and TPDL NAND Gate Power-Delay Product

Figure 3.38: Loading Effects on CGaAs Static and TPDL NAND Gates

IV. DESIGN AND ANALYSIS OF COMPLEMENTARY GaAs
STATIC AND DYNAMIC SEQUENTIAL CIRCUITS



A sequential circuit is a circuit where the output is a function of either the previous
inputs (state) or the previous inputs (state) and the current inputs. There are two major
classes of sequential circuits, pulse-mode sequential circuits and fundamental-mode
sequential circuits. Pulse-mode sequential circuits are often called synchronous circuits
because their action is synchronized with the pulse input. Fundamental-mode sequential
circuits are often called asynchronous circuits. To guarantee proper operation of a
fundamental-mode sequential circuit, only one input is allowed to change at any given time.
Also, an input can only change when the circuit is internally stable. Fundamental-mode
analysis is more complex than pulse-mode analysis because it requires tracking all changes
of internally stored signals. Also, the necessity of avoiding critical races in
fundamental-mode circuits makes the process of assigning internal variables to internal states
rather complex. Moreover, the initial determination of the flow table and the process of
minimizing the number of internal states require care.

The purpose of this chapter is to design different pulse-mode sequential logic functions
using Complementary GaAs (CGaAs) static and TPDL circuit topologies, and then to compare
their performance. The functions designed in this chapter are: D-latch, D flip flop,
divide-by-two circuit, 3-Bit Linear Feedback Shift Register (3BLFSR), and 4BLFSR.
Section A discusses the design of these functions using static circuits, while in Section B,
the same functions are designed using TPDL. In Section C, the comparison between static
and TPDL circuit performance is established.

A. CGaAs STATIC SEQUENTIAL CIRCUITS 

1. D-Latch Circuit 

A CGaAs D-latch static circuit is shown in Figure 4.1. It consists of ten transistors
(three inverters and two pass gates). The circuit has been designed, simulated using
HSPICE and then analyzed and optimized in layout area to achieve the highest frequency
of operation. All transistor gate lengths used in this design are 0.7 µm, while the gate widths
are 10 µm for the inverter transistors and 5 µm for pass-gate transistors (both NFETs and
PFETs). Pass gates do not regenerate the input logic levels, so increasing their gate
widths does not increase the speed of the circuit; instead, the circuit speed decreases because of
the increased loading on the previous stage. The maximum operating frequency of the designed
circuit is limited by the propagation delay of the signal through the entire circuit. The
circuit was simulated with a power supply of 2.0 V and input signal transitions
between 0.0 V and 1.75 V. The maximum clock frequency of the latch is 0.82 GHz and
the average consumed power at this frequency is 12.32 mW when loaded by two inverters
(fan-out of two). The D input and clock signals applied to the circuit have rise and fall times of
0.01 ns. The time required to pull up the output to a logic high level, measured from 50% of
the clock pulse rising edge to 50% of the output signal rising edge (T_PLH), is 0.75 ns. The
time required to pull down the output to a logic low level, measured from 50% of the clock
pulse rising edge to 50% of the output signal falling edge (T_PHL), is 0.45 ns. Input and output
waveforms of the D-latch circuit at a frequency of 0.82 GHz are shown in Figure 4.2. The
top waveform is the clock pulse applied to the circuit, the second waveform is the
D input and the bottom waveform is the circuit output.




Figure 4.1: CGaAs Static D-Latch Circuit

Figure 4.2: Input-Output Waveform of CGaAs Static D-Latch

2. D Flip Flop (D-FF) Circuit Design

A negative edge-triggered D-FF, which contains 20 transistors, is shown in Figure 4.3.
This design is a master-slave flip flop. It consists of two D-latches, with the Q
output of the first latch connected to the D input of the second latch. Transistor sizes are the
same as for the D-latch circuit (designed in the previous subsection). The maximum operating
frequency of this circuit is 0.82 GHz with a power supply voltage of 2.0 V. Input signals
switch between 0.0 V and 1.75 V and the circuit has a fan-out of two. The average
power consumed by the circuit at the maximum operating frequency is 20.8 mW. The
pull-up time of the flip flop is 0.83 ns, measured from 50% of the clock falling edge (after the
input changes to a logic high level) to 50% of the output rising edge. The pull-down time is 0.58
ns, measured from 50% of the clock falling edge (after the input changes to a logic low
level) to 50% of the output falling edge. Input and output waveforms of the D-FF at a clock
frequency of 0.82 GHz are shown in Figure 4.4.

Figure 4.4: Input-Output Waveforms of CGaAs Static D Flip Flop

3. Divide-By-Two Circuit Using D Flip Flops

The CGaAs divide-by-two static circuit design is based on the D flip flop circuit
described in the previous subsection. When the complemented Q output of the D flip flop is
fed back to the D input, the output toggles once per clock cycle, so the frequency of the
output is the input clock frequency divided by two.
The logic diagram for this divider is shown in Figure 4.5. The circuit was designed, then
simulated using HSPICE to measure the performance. The maximum operating frequency of
this divider is 0.82 GHz, the same as the maximum frequency of the D-FF, when powered from
a 2.0 V power supply. The input clock signal transitions between 0.0 V and 1.75 V with a
load of two inverters (fan-out of two). The average power consumed by the divider circuit at
the maximum frequency of operation is 22.5 mW. Input and output waveforms of the
circuit operating at maximum frequency (0.82 GHz) are plotted in Figure 4.6. The top
waveform in this figure is the input clock, while the bottom waveform is the circuit output.
Figure 4.5: CGaAs Static Divide-By-Two Circuit






Figure 4.6: Input-Output Waveform of CGaAs Static Divide-By-Two Circuit 
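The divide-by-two behavior can be illustrated with a purely behavioral model. This is a sketch, not the HSPICE netlist: the function name `divide_by_two` is hypothetical, the model assumes the D input is driven by the complement of Q, and it ignores all delays.

```python
def divide_by_two(clock_edges: int, q: int = 0) -> list[int]:
    """Return Q after each active clock edge when D is tied to NOT Q."""
    outputs = []
    for _ in range(clock_edges):
        d = 1 - q          # feedback: D is the complement of Q
        q = d              # the flip flop captures D on the active edge
        outputs.append(q)
    return outputs

q_trace = divide_by_two(8)
print(q_trace)
```

Q changes once per clock cycle, so one full period of Q spans two clock periods, which is the halved frequency described above.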
4. Linear Feedback Shift Registers (LFSRs)

An LFSR is a logic network constructed from the following basic components: unit
delay or D Flip Flop, modulo-2 adder, and modulo-2 scalar multiplier. Such a circuit is
considered to be linear because it preserves the principle of superposition: its response to a
linear combination of stimuli is the linear combination of the responses to the individual
stimuli. LFSR circuits are used extensively as sources of pseudorandom binary test
sequences. These sequences have many properties similar to random sequences, but they are
periodic and deterministic, so they are pseudorandom rather than random. LFSR circuits
are autonomous: they have no inputs except for a clock. They are also cyclic in the sense
that, when clocked repeatedly, they go through a fixed sequence of states. The maximum
number of states that an n-stage LFSR circuit can generate is 2^n - 1. An LFSR that
generates the maximum number of states is called a maximum length shift register. If an
LFSR generates a cyclic state sequence of length k, then the output sequence repeats itself
every k clock cycles.
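The 2^n - 1 bound can be checked with a small behavioral model. This is a sketch under stated assumptions: the function name `lfsr_period` is hypothetical, and the tap positions (XOR of the last two stages, fed back to the first stage) are chosen for illustration and are not taken from the schematics in this chapter.

```python
def lfsr_period(n: int, taps: tuple[int, ...], seed: int = 1) -> int:
    """Clock cycles before an n-stage XOR-feedback LFSR returns to its seed."""
    mask = (1 << n) - 1
    start = seed & mask
    state = start
    steps = 0
    while True:
        feedback = 0
        for t in taps:                 # modulo-2 sum of the tapped stages
            feedback ^= (state >> t) & 1
        # Shift every stage by one position and load the feedback into stage 0.
        state = ((state << 1) | feedback) & mask
        steps += 1
        if state == start:
            return steps
        if steps > mask + 1:           # guard: the seed is never revisited
            raise ValueError("state sequence does not return to the seed")

for n, taps in [(3, (1, 2)), (4, (2, 3))]:
    print(n, lfsr_period(n, taps), 2**n - 1)
```

With these assumed taps both registers are of maximum length, visiting every state except the all-zero state before repeating.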

The LFSRs designed here are of maximum length and will be used to generate a test 
sequence for testing the performance of circuits described in later chapters. The motivation 
for using LFSRs is to reduce the number of input/output terminals for the designed circuits. 
This number needs to be reduced because of the limited number of high frequency test 
probes that can be used simultaneously and because of the difficulty in generating off-chip, 
multi-bit test vectors at high speed. A 3-Bit LFSR will be used in the circuits that need three
inputs, while a 4-Bit LFSR will be used in the circuits that need four inputs. During testing,
all circuits will have only one input (clock) and the LFSR will internally generate the test 
vector required for testing the circuit. 

a. Three-Bit LFSR

A three-bit LFSR has been designed to generate a maximum-length test vector
for the circuits having three inputs. The circuit schematic is shown in Figure 4.7. The circuit
consists of three D Flip Flops and one 8-transistor XOR gate. The LFSR will generate the
sequence listed in Table 4.1 (all states except for state 000). If the state 000 is reached, the
circuit will stay in this state forever (the layout contains a triggering input to force the
circuit to start in a state other than 000). The 3-Bit LFSR circuit was designed and simulated
using HSPICE. The power supply is 2.0 volts and the input clock switches between 0.0
volts and 1.75 volts. When the circuit output is loaded by two static inverters, the maximum
clock frequency is 0.55 GHz. The maximum speed of the circuit is limited by the maximum
speed of the static XOR gate (0.55 GHz). The average power consumed by the circuit at the
maximum frequency is 40 mW. The generated output sequence is shown in Figure 4.8 and
matches the sequence listed in Table 4.1.



Table 4.1: 3-Bit LFSR Generated Sequence
Figure 4.7: CGaAs Static 3-Bit LFSR Circuit

Figure 4.8: Input-Output Waveforms of CGaAs Static 3-Bit LFSR
b. Four-Bit LFSR

A four-bit LFSR of maximum length was also designed. The schematic diagram
is shown in Figure 4.9. The circuit consists of four D Flip Flops and one 8-transistor XOR
gate. The four-stage circuit generates all the states except the state 0000. The generated
sequence for the four outputs is listed in Table 4.2 (state 0000 is unreachable). When
an XNOR gate is used instead of an XOR gate, the unreachable state is 1111 instead of 0000.
The circuit has been simulated using HSPICE simulation tools. The power supply voltage
is 2.0 volts and the clock input switches between 0.0 volts and 1.75 volts. The maximum
operating frequency of the circuit is 0.55 GHz when the output is loaded by two static
inverters. The average power consumed by the circuit at the maximum frequency is 48.2 mW.
The generated output sequence of the circuit is shown in Figure 4.10. The circuit was also
simulated at different operating frequencies to measure the average consumed power as a
function of frequency. The average power consumed by the circuit increases
approximately linearly with the operating frequency, as shown in
Figure 4.11.
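The remark about the unreachable state can be checked by searching for states that map onto themselves, since a register that enters such a state stays there forever. This is a sketch: the names `next_state` and `lockup_states` are hypothetical, and the feedback taps (the last two stages) are an assumption, since the actual taps are defined only by the schematic in Figure 4.9.

```python
def next_state(state: int, n: int, taps: tuple[int, ...], invert: bool) -> int:
    """One clock of an n-stage LFSR with XOR (or XNOR, if invert) feedback."""
    feedback = 0
    for t in taps:
        feedback ^= (state >> t) & 1
    if invert:
        feedback ^= 1                  # XNOR is the complement of XOR
    return ((state << 1) | feedback) & ((1 << n) - 1)

def lockup_states(n: int, taps: tuple[int, ...], invert: bool) -> list[int]:
    """States that the register can never leave once entered."""
    return [s for s in range(1 << n) if next_state(s, n, taps, invert) == s]

xor_lockup = lockup_states(4, (2, 3), invert=False)
xnor_lockup = lockup_states(4, (2, 3), invert=True)
print([format(s, "04b") for s in xor_lockup])
print([format(s, "04b") for s in xnor_lockup])
```

With XOR feedback the only such state is 0000, and with XNOR feedback it is 1111, in agreement with the statement above.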



Table 4.2: 4-Bit LFSR Generated Sequence
Figure 4.9: CGaAs Static 4-Bit LFSR Circuit
Figure 4.10: Input-Output Waveforms of CGaAs Static 4-Bit LFSR
Figure 4.11: CGaAs Static 4-Bit LFSR Power Consumption

B. CGaAs TPDL SEQUENTIAL CIRCUITS 

The CGaAs TPDL sequential circuits designed here will be used as building blocks in
the circuits of the next chapters. The designed circuits are the D Flip Flop, 3-Bit LFSR and
4-Bit LFSR. The operation of these circuits requires two non-overlapped clock phases and
their complements. The clock phases φ1 and φ2 have to be non-overlapped in the logic low
level, as explained in Section B, Chapter III. These clock phases are generated using the
clock generator designed in Chapter III, Section C.

1. D Flip Flop (DFF) Circuit 

A CGaAs TPDL D Flip Flop circuit diagram is shown in Figure 4.12. The circuit
consists of two pass gates and two TPDL inverters, a total of 10 transistors. The transistor
gate lengths are 0.7 µm, while the transistor gate widths are written on each transistor in the
diagram. The D Flip Flop circuit has been designed and simulated using the HSPICE
simulation tool. The circuit was powered from a 2.0 volt power supply. The D-input and
four clock phases switch between 0.0 volts and 1.75 volts. The maximum operating
frequency of the circuit is 2.0 GHz when loaded by two TPDL inverters. The average power
consumption is 4.54 mW at the maximum operating frequency. Input-output waveforms
are shown in Figure 4.13. The first waveform is the φ1 clock phase, the second waveform
is the D-input, and the third is the flip flop Q output. The D-input is applied to the φ1
section of the circuit and should be stable (unchanged) during φ1 logic low (as explained in
detail in Chapter III, Section B). The output of the circuit is taken from the φ2 section, so it
is precharged to Vdd during φ2 logic low (φ1 logic high) and evaluated during φ2 logic high.
This D Flip Flop will be used in the design of the TPDL Linear Feedback Shift Register in
the next subsection.



Figure 4.12: CGaAs TPDL D Flip Flop Circuit
Figure 4.13: Input-Output Waveform of CGaAs TPDL D Flip Flop
2. Linear Feedback Shift Registers (LFSR) 

The Linear Feedback Shift Registers designed here will be of maximum length to be 
functionally identical to the static LFSRs designed in Section A of this chapter. 

a. Three-Bit LFSR 

The CGaAs TPDL 3-Bit LFSR designed here consists of two TPDL D Flip Flops
(designed in the previous subsection) and one TPDL XOR gate (designed in Chapter III,
Section B). The outputs of the two D Flip Flops, Q0 and Q1, are applied to the XOR-gate
inputs. The XOR-gate output is fed to the D0 input. The TPDL XOR gate output is clocked,
so it can be used as a separate stage. The sequence generated by this LFSR is identical to
that generated by the static LFSR designed in Section A and listed in Table 4.1. The TPDL
3-Bit LFSR circuit has been designed and simulated using the HSPICE simulation tool to
test the circuit performance. The maximum operating frequency of the circuit is 1.2 GHz when
powered from a 2.0 volt power supply. The input clock signal transitions between 0.0 volts
and 1.75 volts and every Q output of the circuit is loaded by two TPDL inverters (fan-out
of two). The average power consumed by the circuit at the maximum operating frequency is
10.5 mW. The Q0, Q1, and Q2 outputs of the circuit are shown in Figure 4.14. All Q outputs are
taken from the φ2 sections of the circuit. Therefore, they precharge to Vdd when the φ2
phase is logic low and evaluate when it is logic high.




Figure 4.14: Input-Output Waveform for CGaAs 3-Bit LFSR
b. Four-Bit LFSR 

The CGaAs TPDL Four-Bit LFSR has been designed and simulated using
HSPICE simulation tools. It consists of three D Flip Flops and one XOR gate. The maximum
clock frequency of the circuit is 1.2 GHz when powered from a 2.0 volt power supply. The
input clock signal switches between 0.0 volts and 1.75 volts and all Q outputs are loaded
by two TPDL inverters (fan-out of two). The circuit consumes an average power of 15.89
mW at the maximum frequency. The sequence generated by this LFSR circuit is identical
to that generated by the static 4-Bit LFSR and listed in Table 4.2. The four Q outputs are
shown in Figure 4.15. All Q outputs precharge during φ2 logic low and evaluate during φ2
logic high because they are outputs from φ2 sections of the circuit. The generated sequence
can also be read directly from Figure 4.15. The dependence of the consumed power on the
operating frequency of the circuit was also studied and found to be approximately linear, as
plotted in Figure 4.16.




Figure 4.15: Input-Output Waveform for CGaAs 4-Bit LFSR
Figure 4.16: CGaAs TPDL 4-Bit LFSR Power Consumption

C. COMPARISON BETWEEN STATIC AND TPDL SEQUENTIAL CIRCUIT 
SIMULATION RESULTS 

In the previous two sections, the design, analysis and simulation of CGaAs static and
TPDL sequential circuits were explained in detail. These circuits were designed to be
functionally identical for the purpose of comparing their performance. Table 4.3
summarizes the performance of both the static and TPDL sequential circuits designed in
this chapter [49].

The maximum frequency of operation is 0.55 GHz for the static 4-Bit LFSR and 1.2
GHz for the TPDL circuit. The layout area (transistor-gate area) of the static 4-Bit LFSR is
490 µm², while that of the TPDL circuit is 143 µm². Also, the average power consumed by the
static circuit is 48.2 mW, while the TPDL circuit consumes only 15.89 mW at the
maximum frequency of operation. Therefore, TPDL circuits outperform the static design
in maximum frequency of operation, average consumed power and total layout area. As
mentioned in Chapter III, for the comparison to be fair, two factors must be considered.
First, the clock generator required for TPDL circuit operation must be taken into
account, because it increases the actual power consumption and layout area of the TPDL
designs. However, the clock generator can drive many TPDL circuits, so its consumed power
is divided over all the driven circuits. Therefore, the increase in the power
consumption and layout area of the TPDL circuits due to the clock generator is minimal, as
explained in Chapter III, Section D. Second, the comparison should be made at the
same operating frequency. Figure 4.17 shows the average power consumed by both static
and TPDL 4-Bit LFSR circuits at different frequencies. The fair comparison can be read
directly from this figure at any given frequency.
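One way to put the two designs on a common footing is to divide the reported power by the reported frequency, which gives an energy per clock cycle. This is a sketch only: energy per cycle is not a metric used in this chapter, the helper name `energy_per_cycle_pj` is hypothetical, and the calculation ignores the clock generator and any frequency-independent (static) component of the power.

```python
def energy_per_cycle_pj(power_mw: float, freq_ghz: float) -> float:
    """Energy per clock cycle in picojoules (1 mW / 1 GHz = 1 pJ)."""
    return power_mw / freq_ghz

# Values reported for the 4-Bit LFSR circuits at their maximum frequencies.
circuits = {
    "static": {"f_ghz": 0.55, "p_mw": 48.2, "area_um2": 490.0},
    "TPDL": {"f_ghz": 1.2, "p_mw": 15.89, "area_um2": 143.0},
}

energy = {
    name: energy_per_cycle_pj(c["p_mw"], c["f_ghz"]) for name, c in circuits.items()
}
area_ratio = circuits["static"]["area_um2"] / circuits["TPDL"]["area_um2"]

for name, e in energy.items():
    print(f"{name}: {e:.1f} pJ per cycle")
print(f"static/TPDL gate-area ratio: {area_ratio:.2f}")
```

Because the power of both circuits is reported to grow approximately linearly with frequency, this per-cycle figure is a rough stand-in for reading Figure 4.17 at a common frequency.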



Table 4.3: CGaAs Static and TPDL Sequential Circuit Performance
Figure 4.17: CGaAs TPDL and Static 4-Bit LFSR Power Consumption
V. DESIGN OF TWO-LEVEL LOGIC FUNCTIONS USING
COMPLEMENTARY GaAs STATIC AND DYNAMIC LOGIC
FAMILIES

A two-level logic circuit is a circuit which can be divided into two separate consecutive 
logic blocks. A two-level logic function is a logic function which requires a two-level logic 
circuit for implementation. In Chapter III, the design of one-level functions (inverter,
NAND gate and NOR gate) using Complementary GaAs (CGaAs) static and TPDL circuits
was discussed in detail. In this chapter, four different logic functions are selected for
simulation and implementation using static and dynamic logic families. Dynamic logic 
families discussed in this chapter are Domino logic, N-P Domino logic and TPDL. These 
functions were selected because they are representative of typical two-level logic functions. 
The selected functions are as follows: 

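
$F_1 = \overline{\overline{(A + B)} + C}$    (5.1)

$F_2 = \overline{\overline{(A \cdot B)} \cdot C}$    (5.2)

$F_3 = \overline{\overline{(A + B)} \cdot C}$    (5.3)

$F_4 = \overline{\overline{(A \cdot B)} + C}$    (5.4)
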

Section A discusses the design and simulation of the selected functions using static 
logic circuits. The design and simulation of the same logic functions using Domino logic 
is explained in Section B. Section C discusses N-P Domino circuits for implementing the 
above four logic functions. TPDL circuits generating the same logic functions are discussed 
in Section D. The comparison between the performance of all the above logic families is 
explained in Section E. The effect of increasing the load on the maximum operating 
frequency of all these logic families is discussed in Section F. In Section G, the effect of 
changing the power supply voltage on the performance of all logic families is explained. In 
all the above sections, the circuits designed are tested exhaustively to ensure their correct
functionality.
A. CGaAs STATIC LOGIC CIRCUIT DESIGN



CGaAs static logic circuit designs discussed in this chapter are similar to static CMOS 
logic circuit designs that implement the same logic functions. They dissipate a small 
amount of static power in addition to the dynamic power dissipated. Transistor gate widths 
have been chosen according to DC transfer curves to get the optimal noise margin. 
Transistor gate lengths for all circuits are 0.7 µm. In this section, the design and HSPICE
simulations of four different logic functions are performed. 

The CGaAs static circuit that generates the logic function
$F_1 = \overline{\overline{(A + B)} + C}$ is
shown in Figure 5.1. All transistor gate widths of the circuit are 10 µm. The maximum
operating frequency of the circuit is 0.62 GHz when powered from a 2.0 volt power
supply and with input variables that transition between 0.0 volts and 1.75 volts. The maximum
frequency is achieved when the circuit output is loaded by two static inverters (fan-out of
two). The average power consumed by the circuit at the maximum operating frequency is 8.69
mW when the A input is connected to a pulse generator and both the B and the C inputs are
tied to 0.0 volts to propagate the A input to the circuit output. Increasing the P-transistor
width will not increase the maximum operating frequency but will increase the power
dissipation. Input-output waveforms of the circuit operating at the maximum frequency are
shown in Figure 5.2.

The CGaAs static circuit that generates the logic function
$F_2 = \overline{\overline{(A \cdot B)} \cdot C}$ is
shown in Figure 5.3. Analysis of the circuit was performed with the same power supply,
input transitions and load as the circuit that generates logic function F1. Input-output
waveforms of the circuit are shown in Figure 5.4. The maximum operating frequency of the
circuit is 0.83 GHz. The circuit consumes an average power of 10.39 mW at the maximum
operating frequency. It should be noted that the maximum operating frequency of this
circuit is higher than that of the circuit that generates logic function F1 because the latter
has two series P-channel transistors, which slow it down. Also, the power consumption of this
circuit is higher than that of the circuit that generates logic function F1 because it operates
at a higher frequency and thus consumes more dynamic (frequency-dependent) power.

The CGaAs static circuit that generates the logic function
$F_3 = \overline{\overline{(A + B)} \cdot C}$ is
shown in Figure 5.5. The circuit is analyzed with the same power supply, input transitions and
load as the previous two circuits. The maximum operating frequency of the circuit is 0.62 GHz
and the power consumption at this frequency is 9.49 mW. Input-output waveforms of the
circuit operating at the maximum frequency are shown in Figure 5.6.

Finally, the CGaAs static circuit that generates the logic function
$F_4 = \overline{\overline{(A \cdot B)} + C}$
is shown in Figure 5.7. Analysis of the circuit was performed under the same
conditions. Input-output waveforms of the circuit are shown in Figure 5.8. The circuit has
a maximum frequency of operation of 0.62 GHz and consumes an average power of 8.73
mW at the maximum frequency.

Table 5.1 summarizes the maximum operating frequency, the average power 
consumption at maximum frequency of operation and the layout area of the static logic 
circuits designed in this section. The layout area listed in this table is just the transistor gate 
area. It does not include the area of interconnect between the transistors. 
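The transistor gate area used as the layout-area figure is the sum of width times length over all transistors. The sketch below applies this to the F1 circuit; the function name `gate_area_um2` is hypothetical, and the count of eight transistors is an assumption based on the transistor-count column of the table (two 2-input gates), combined with the 10 µm widths and 0.7 µm lengths stated above.

```python
def gate_area_um2(widths_um: list[float], length_um: float) -> float:
    """Total transistor gate area: sum of width * length, in square micrometers."""
    return sum(w * length_um for w in widths_um)

# Assumed: eight transistors, all 10 um wide, with the 0.7 um gate length.
f1_widths_um = [10.0] * 8
f1_area = gate_area_um2(f1_widths_um, length_um=0.7)
print(f"F1 gate area: {f1_area:.1f} um^2")
```

The result agrees with the 56 µm² layout area listed for the static circuits, and it excludes interconnect exactly as the table does.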



Table 5.1: CGaAs Static Logic Circuit Performance

Generated   Maximum           Average Consumed   Transistor   Layout
Function    Frequency [GHz]   Power [mW]         Count        Area [µm²]
F1          0.62              8.69               8            56
F2          0.83              10.39              8            56
F3          0.62              9.49               8            56
F4          0.62              8.73               8            56

Figure 5.4: Input-Output Waveform of CGaAs Static F2 Generator

Figure 5.6: Input-Output Waveform of CGaAs Static F3 Generator
B. CGaAs DOMINO LOGIC CIRCUIT DESIGN



Domino circuits consist of a dynamic N-transistor logic block followed by a static 
inverter. When the input clock signal is logic low (pre-discharge phase), the circuit output 
will discharge to the logic low level. When the clock switches to logic high (evaluation 
phase), the N-transistor logic block will evaluate the input signals and perform the specified 
logic function. The circuit output either stays at a logic low or charges to logic high 
(according to the input variables) through the output static inverter. Thus, the output will 
have at most one transition during evaluation, which prevents erroneous output states that 
can occur in simple dynamic logic schemes. Domino is not a complete logic family because 
it does not generate inverted functions. Also, Domino circuits are not completely dynamic 
because they contain static inverters. Inverted logic functions have to be re-expressed in a 
non-inverted expression to be represented in Domino logic. The principle of operation of
Domino dynamic logic circuits is explained in Chapter I, Section E.

In this section, the four logic functions mentioned in the beginning of the chapter are 
designed using CGaAs Domino logic, then simulated using HSPICE simulation tools. 
Maximum operating frequency of each logic circuit and the average power consumption at 
the maximum frequency are the parameters measured. All circuits are simulated using a 2.0 
volt power supply and inputs switch between 0.0 volts and 1.75 volts. Also, simulations are 
performed with the circuits loaded by two inverters (fan-out of two). Gate lengths for all
transistors in the circuits are 0.7 µm. Gate widths of each transistor have been chosen to
optimize the maximum frequency of the circuit and are written on the schematic. During 
simulation and operation, all inputs must be stable (unchanged) during the evaluation phase 
to prevent data corruption. 

The CGaAs Domino circuit that generates the logic function
$F_1 = \overline{\overline{(A + B)} + C}$ is
shown in Figure 5.9. This inverted function must be converted to a non-inverted expression
to be represented in Domino logic (inverted functions cannot be represented in the Domino
logic family). The new expression for the function is $F_1 = (A + B) \cdot \overline{C}$. The
highest frequency of the circuit is obtained for the transistor gate widths written on each
transistor of Figure 5.9. Transistor gate widths for the static inverters are 2 µm for
N-channel transistors and 4 µm for P-channel transistors. The maximum operating frequency of
the circuit is 1.61 GHz. The average power consumed by the circuit at the maximum operating
frequency is 3.54 mW when the C input is switching and both the A and B inputs are tied
to 1.75 volts (logic high level) to propagate the effect of the C input to the circuit output.
When the C input switches, the static inverter at this input will switch between logic levels
and consume dynamic (switching) power in addition to the static power consumed. Thus,
the total power consumption of the circuit increases when the C input switches.
Input-output waveforms for the circuit at the maximum frequency of operation, with the C input
switching between logic levels, are shown in Figure 5.10. It can be seen from the waveforms
that the circuit output always pre-discharges to logic low when the clock input is low
(precharge phase). When the clock input is high (evaluation phase), the circuit evaluates the
output node according to the input signals that are present.

The CGaAs Domino circuit that generates the logic function
$F_2 = \overline{\overline{(A \cdot B)} \cdot C}$
is shown in Figure 5.11. The non-inverted expression for the function is
$F_2 = (A \cdot B) + \overline{C}$ for implementation in Domino logic. The maximum frequency of
operation for the circuit is obtained for the transistor gate widths written on each transistor
of Figure 5.11, while transistor gate widths for the static inverters are 2 µm for N-channel
transistors and 4 µm for P-channel transistors. The maximum operating frequency of the
circuit is 1.92 GHz and the average consumed power at this frequency is 4.1 mW when the
C input is switching and both the A and B inputs are tied to a logic low level. Input-output
waveforms for the circuit at the maximum operating frequency are shown in Figure 5.12.

The CGaAs Domino logic circuit that generates the function
$F_3 = \overline{\overline{(A + B)} \cdot C}$ is
shown in Figure 5.13. Because Domino is a non-inverting logic, the function has to be
converted to a non-inverted expression to match the characteristics of the Domino logic
family. The non-inverted expression of the function is $F_3 = (A + B) + \overline{C}$. The
maximum operating frequency of the circuit is 1.92 GHz and the average consumed power
at this frequency is 3.97 mW when the C input is switching and both the A and B inputs are
connected to a logic low level. Input-output waveforms for the circuit at the maximum
operating frequency are shown in Figure 5.14.

The CGaAs Domino circuit that generates the logic function
$F_4 = \overline{\overline{(A \cdot B)} + C}$
is shown in Figure 5.15. The non-inverted expression for the logic function is
$F_4 = (A \cdot B) \cdot \overline{C}$. The maximum frequency of operation of this circuit is
1.62 GHz and the
average consumed power at this frequency is 3.6 mW when the C input transitions between
logic levels and the A and B inputs are connected to a logic high level to propagate the
effect of the C input to the circuit output. Input-output waveforms for the circuit at the
maximum operating frequency are shown in Figure 5.16. Table 5.2 summarizes the
performance of all Domino logic circuits designed in this section.
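The conversion of each inverted function to its non-inverted Domino form is an application of De Morgan's theorem, and it can be checked exhaustively over all eight input combinations. This is a sketch: the overbar placement in the four functions is a reconstruction (an inner inversion on the first-level term, an outer inversion on the whole expression, and a complemented C in the Domino forms), and the dictionary and helper names are hypothetical.

```python
from itertools import product

def inv(x: int) -> int:
    """Logical complement of a single bit."""
    return 1 - x

# Two-level functions as assumed here: NOT(NOT(first level) op C).
original = {
    "F1": lambda a, b, c: inv(inv(a | b) | c),
    "F2": lambda a, b, c: inv(inv(a & b) & c),
    "F3": lambda a, b, c: inv(inv(a | b) & c),
    "F4": lambda a, b, c: inv(inv(a & b) | c),
}

# Non-inverted forms for Domino logic; only the C input is complemented.
domino = {
    "F1": lambda a, b, c: (a | b) & inv(c),
    "F2": lambda a, b, c: (a & b) | inv(c),
    "F3": lambda a, b, c: (a | b) | inv(c),
    "F4": lambda a, b, c: (a & b) & inv(c),
}

mismatches = [
    (name, bits)
    for name in original
    for bits in product((0, 1), repeat=3)
    if original[name](*bits) != domino[name](*bits)
]
print("mismatches:", mismatches)
```

Each Domino form needs an inverter only on the C input, which is consistent with the single input inverter described for these circuits and with the test conditions in which A and B are tied so that the output follows the C input.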



Table 5.2: CGaAs Domino Logic Circuit Performance

Generated   Maximum           Average Consumed   Transistor   Layout
Function    Frequency [GHz]   Power [mW]         Count        Area [µm²]
F1          1.61              3.54               9            19.6
F2          1.92              4.104              9            21
F3          1.92              3.97               9            19.6
F4          1.61              3.6                9            22.4

Figure 5.10: Input-Output Waveform of CGaAs Domino Logic F1 Generator
Figure 5.12: Input-Output Waveform of CGaAs Domino Logic F2 Generator
Figure 5.13: CGaAs Domino Logic Circuit to Generate Function F3
Figure 5.14: Input-Output Waveform of CGaAs Domino Logic F3 Generator
Figure 5.15: CGaAs Domino Logic Circuit to Generate Function F4
Figure 5.16: Input-Output Waveform of CGaAs Domino Logic F4 Generator
C. CGaAs N-P DOMINO LOGIC CIRCUIT DESIGN



N-P Domino logic circuits use both N-channel transistors and P-channel transistors to 
evaluate a logic expression. N-P Domino logic gates are implemented using N-type and P- 
type transistor blocks. N-type transistor block outputs precharge to Vdd while P-type 
transistor block outputs pre-discharge to zero volts during their precharge phases (zipper 
operation). During the precharging phase, all transistor gates of the n-logic blocks are 
discharged to ground and they are turned off through the preceding p-logic blocks. Also, 
the transistors of the p-logic blocks will be turned off by the preceding n-logic blocks. 
Unlike Domino logic, N-P Domino logic is a complete logic family because inverted 
functions can be implemented. It is required that all inputs to N-transistor logic blocks be 
stable (unchanging) during the evaluation phase (clock is high). Also, all inputs to P- 
transistor logic blocks are required to be stable during the evaluation phase of P sections 
(clock is high). A complete explanation and basic operations for N-P Domino logic family 
is presented in Chapter I, Section E. 

In this section, the four logic functions selected previously are designed using CGaAs 
N-P Domino logic gates. These circuits are simulated using HSPICE simulation tools to 
evaluate their performance. The maximum frequency of operation and power consumption 
are the parameters measured. Because P-channel transistors are used in evaluating the 
expression, it is expected that the maximum frequency of operation of this logic family 
should be less than that of Domino logic circuits. Gate lengths of all transistors in the
designed circuits are 0.7 µm. In HSPICE simulations, only one input changes at a time. The
other two inputs are set to propagate the effect of the switching input to the circuit output. 

A CGaAs N-P Domino circuit that generates the logic function
$F_1 = \overline{\overline{(A + B)} + C}$
is shown in Figure 5.17. Transistor gate widths are in micrometers and written on each
transistor of the figure. If this circuit is inserted in a chain of N-P Domino logic gates and
its inputs are driven from another N-P Domino circuit, the transistors Q1 and Q4 can be
removed. In this case, it is guaranteed that all transistors of the n-logic (p-logic) block
are turned off during the precharge phase and there is no current path from Vdd to ground.
The designer has to make sure that inputs of P sections are driven only from outputs of N
sections. If an output of an N section (P section) is required to be connected to an input of
another N section (P section), it has to go through a static inverter. The circuit in Figure
5.17 has three series P transistors in its P section, which causes the output to charge very
slowly during the evaluation phase. The widths of these P transistors have to be increased
to compensate for the slow charging, which increases the power consumption. The maximum
operating frequency of the circuit is 0.82 GHz. The average power consumption at the
maximum operating frequency is 1.646 mW. Input-output waveforms of the circuit at the
maximum operating frequency are shown in Figure 5.18.

A CGaAs N-P Domino circuit that generates the logic function
$F_2 = \overline{\overline{(A \cdot B)} \cdot C}$
is shown in Figure 5.19. The maximum frequency of operation of the circuit is 1.2 GHz and
the average power consumption at this frequency is 1.863 mW. Maximum operating 
frequency of this circuit is higher than the circuit in Figure 5.17 because it has only two 
series P transistors in the P-section, compared to three for the other circuit. Also, the power 
consumption is less than that of Figure 5.17 due to the use of narrower P-channel 
transistors. Input-output waveforms of the circuit at the maximum operating frequency are 
shown in Figure 5.20. 

A CGaAs N-P Domino circuit that generates the logic function
$F_3 = \overline{\overline{(A + B)} \cdot C}$
is shown in Figure 5.21. Transistor gate widths are in micrometers and written on each
transistor in the figure. The maximum operating frequency of this circuit is 1.2 GHz and it 
consumes an average power of 1.682 mW at this frequency. Input-output waveforms of the 
circuit at the maximum operating frequency are shown in Figure 5.22. 

A CGaAs N-P Domino circuit that generates the logic function
$F_4 = \overline{\overline{(A \cdot B)} + C}$
is shown in Figure 5.23. The maximum frequency of operation of this circuit is 0.82 GHz
and it consumes an average power of 2.484 mW at this frequency. The maximum frequency
of the circuit is low because it has three series P-channel transistors in the P section.
Input-output waveforms of the circuit at the maximum operating frequency are shown in Figure
5.24. The performance of all CGaAs N-P Domino logic circuits designed in this section is
summarized in Table 5.3.



Table 5.3: CGaAs N-P Domino Logic Circuit Performance

Generated   Maximum           Average Consumed   Transistor   Layout
Function    Frequency [GHz]   Power [mW]         Count        Area [µm²]
F1          0.82              2.487              8            22.4
F2          1.2               1.863              8            19.6
F3          1.2               1.682              8            16.8
F4          0.82              2.484              8            23.8

Figure 5.17: CGaAs N-P Domino Logic Circuit to Generate Function F1
Figure 5.18: Input-Output Waveform of CGaAs N-P Domino F1 Generator
Figure 5.19: CGaAs N-P Domino Logic Circuit to Generate Function F2
Figure 5.21: CGaAs N-P Domino Logic Circuit to Generate Function F3
Figure 5.22: Input-Output Waveform of CGaAs N-P Domino Logic F3 Generator
Figure 5.23: CGaAs N-P Domino Logic Circuit to Generate Function F4
Figure 5.24: Input-Output Waveform of CGaAs N-P Domino Logic F4 Generator



D. CGaAs TWO-PHASE DYNAMIC LOGIC (TPDL) CIRCUIT DESIGN 

TPDL logic is similar to N-P Domino logic because of its ‘zipper’ operation. Unlike
N-P Domino logic, TPDL uses only N-channel transistors in the evaluation part of the
circuit. P-channel transistors are used only for precharging the output nodes. TPDL logic
requires two clock phases (φ1 and φ2, non-overlapped in the logic low level) and their
complements for proper operation. TPDL circuits consist of φ1 logic-blocks and φ2
logic-blocks. Each logic-block is preceded by one pass gate at each input. The output from each
logic block can be sampled (read) at the end of its evaluation phase. The output of a φ1 (φ2)
block cannot be connected to an input of another φ1 (φ2) block or fed back to itself. The φ1
clock phase and its complement control the operation of φ1 logic blocks, while the φ2 clock
phase and its complement control the operation of φ2 logic blocks.
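The connection rule stated above (a block's output may only drive blocks of the opposite clock phase, never its own phase or itself) can be expressed as a simple netlist check. The following is a minimal sketch for illustration only; the block names and the dictionary-based netlist representation are hypothetical and are not part of the design flow used in this work.

```python
# Hypothetical netlist: 'phases' maps each logic block to its clock phase
# (1 or 2); 'drivers' maps each block to the blocks driving its inputs.
def phase_violations(phases, drivers):
    """Return (driver, block) pairs that break the TPDL connection rule."""
    violations = []
    for block, sources in drivers.items():
        for src in sources:
            # Same phase on both ends (this also catches self-feedback).
            if phases[src] == phases[block]:
                violations.append((src, block))
    return violations

# Illustrative two-block arrangement: a phase-1 block feeding a phase-2 block.
phases = {"block_ab": 1, "block_c": 2}
drivers = {"block_ab": [], "block_c": ["block_ab"]}
print(phase_violations(phases, drivers))  # []
```

An empty list means every connection alternates between φ1 and φ2 blocks as required.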

In this section, the design and simulation of the previously selected four two-level 
functions are completed using CGaAs TPDL logic. The circuits are simulated with a 2.0 
volt power supply and inputs transition between 0.0 volts and 1.75 volts. Gate lengths of 
all transistors are 0.7 µm. The precharging P-channel transistors have gate widths of 4 µm,
while all N-channel transistors in the evaluation block have gate widths of 2 µm. Pass gate
transistor widths are 2 µm for both N- and P-channel transistors.

A CGaAs TPDL circuit which generates the logic function F1 = ((A + B) + C) is
shown in Figure 5.25. The A and B inputs are applied to a φ1 block, while the C input is
applied to a φ2 block. The A and B inputs must be stable (unchanged) during the evaluation
phase of the φ1 logic-block (φ1 clock phase is logic high), while the C input must be stable
during the evaluation phase of the φ2 logic-block (φ2 clock phase is logic high). The circuit
of Figure 5.25 was designed and simulated using HSPICE. The circuit output is available
one complete clock cycle after the inputs are applied. The maximum operating frequency of
the circuit is 2.38 GHz. The circuit consumes an average power of 1.998 mW at the
maximum frequency of operation. Input-output waveforms of the circuit, when operating
at the maximum frequency, are shown in Figure 5.26. The A input switches while both the
B and C inputs are logic low to propagate the effect of the A input to the circuit output. The
circuit output is precharged to VDD when φ2 is logic low and is evaluated when φ2 is logic
high (the output is taken from the φ2 logic block).

The limitation on the maximum frequency of operation occurs when the output node
is not pulled up to VDD during the precharge phase; this is usually caused by a large
capacitive load on the output node. This problem can be solved by increasing the time
period in which the output is pulled high (the φ1 and φ2 precharge times), but this increases
the clock period and decreases the operating frequency. Increasing the width of the precharging
P-channel transistors also solves the problem but increases the power consumption
of the circuit (a trade-off). Using pass transistors instead of pass gates at the logic-block inputs
has also been tried. N-channel transistors are poor at transmitting a logic 1 and P-channel
transistors are poor at transmitting a logic 0. Using either type by itself decreases the
operating voltage range and consequently decreases the operating frequency.

A CGaAs TPDL circuit that generates the logic function F2 = ((A • B) • C) is
shown in Figure 5.27. The maximum operating frequency of the circuit is 1.92 GHz and the
average power consumption is 1.82 mW at that frequency. Input-output waveforms for the
circuit at the maximum frequency of operation are shown in Figure 5.28. In this figure, the
C input switches between logic levels while both the A and B inputs are tied to logic low
to propagate the effect of the C input to the circuit output.

A CGaAs TPDL circuit that generates the logic function F3 = ((A + B) • C) is
shown in Figure 5.29. The maximum operating frequency of the circuit is 1.92 GHz and it
consumes an average power of 1.75 mW at that frequency. Input-output waveforms for this
circuit at the maximum frequency of operation are shown in Figure 5.30. In this figure, the
A input switches between logic levels, the B input is tied to logic low and the C input is tied
to logic high. The B and C inputs are chosen to propagate the effect of the A input
transitions to the circuit output.

A CGaAs TPDL circuit that generates the logic function F4 = ((A • B) + C) is
shown in Figure 5.31. The maximum frequency of operation for the circuit is 1.92 GHz and
it consumes an average power of 1.82 mW at that frequency. Input-output waveforms for
the circuit at the maximum frequency of operation are shown in Figure 5.32. In this figure,
the A input switches between logic levels, the B input is held constant at logic high and the
C input is logic low. The B and C inputs are chosen to propagate the effect of the A input
transitions to the circuit output.

Table 5.4 summarizes the performances of the TPDL circuits designed in this section. 
During simulation, one input of the circuit is switching while the other two inputs are 
chosen to propagate the effect of the switching input to the circuit output.



Table 5.4: CGaAs TPDL Circuit Performance

Generated   Maximum Frequency   Average Consumed   Transistor   Layout Area
Function    [GHz]               Power [mW]         Count        [µm²]
---------   -----------------   ----------------   ----------   -----------
F1          2.38                2.38               16           25.2
F2          1.92                1.82               16           25.2
F3          1.92                1.75               16           25.2
F4          1.92                1.82               16           25.2

Figure 5.26: Input-Output Waveform of CGaAs TPDL F1 Generator

Figure 5.27: CGaAs TPDL Schematic Circuit to Generate Function F2

Figure 5.28: Input-Output Waveform of CGaAs TPDL F2 Function Generator

Figure 5.29: CGaAs TPDL Logic Circuit to Generate Function F3

Figure 5.30: Input-Output Waveform of CGaAs TPDL F3 Function Generator

Figure 5.31: CGaAs TPDL Logic Circuit to Generate Function F4

Figure 5.32: Input-Output Waveform of CGaAs TPDL F4 Function Generator

E. COMPARISON BETWEEN CGaAs LOGIC FAMILIES



In the previous sections, the design and simulation of four two-level functions were
performed using four different logic families. The logic families studied in this chapter are
Static logic, Domino dynamic logic, N-P Domino dynamic logic and Two-Phase Dynamic
Logic (TPDL). Due to the excellent performance of the TPDL circuits, the TPDL circuit
performing the logic function F1 was implemented using CADENCE tools. This circuit
will be fabricated to test its performance, as will be seen in Chapter VII. Also, the static
logic circuit for the same function has been laid out and will be fabricated and tested for
comparison with the TPDL circuit.

In this section, a comparison between the studied CGaAs logic families is performed. 
Maximum operating frequency, average power consumption and layout area are compared. 

The function F1 = ((A + B) + C) will be taken as a study case and Table 5.5 summarizes
the comparison results [50]. 

In CGaAs static logic circuits, the transconductance ratio of N-channel transistors to 
P-channel transistors has to be properly adjusted to achieve an acceptable noise margin. 
The design of the static logic circuit is quite easy and similar to the design of Silicon CMOS 
circuits implementing the same logic function. The drive capability of this family is quite 
low because the load capacitance is connected directly to the output. Thus, transistor gate 
widths of the driver circuit have to be increased to increase the drive capability, which 
increases the power consumption and the layout area. Static logic has the lowest maximum 
frequency of operation, the highest power consumption and the largest layout area of all the 
studied logic families. 

CGaAs Domino logic is not a complete logic family because inverted functions cannot
be implemented in this family. Also, it is not completely dynamic because a static inverter 
is required at each gate output. Its design is more complicated than the static circuit design. 
Also, it requires a clock signal for proper operation. Domino circuits consist of a dynamic 
logic block followed by a static inverter. Transistor gate widths in the dynamic logic block 
can be chosen for minimal size because they only drive the static inverter. The main
disadvantage of Domino logic is that it is not suitable for pipelined system architectures.
The maximum frequency of operation is more than double that of the static circuits. Also, 
the layout area is about one-third of the static circuit layout area and the power consumption 
is much less than that of static circuits implementing the same logic function. 

CGaAs N-P Domino dynamic logic is a complete logic family and is completely 
dynamic (has no static inverters). It uses both N-channel and P-channel transistor 
evaluation blocks. Using the slow P-channel transistors in evaluating the logic reduces the 
speed of this family. N-P Domino logic requires a clock signal and its complement for 
proper operation. The power consumption and the layout area shown in Table 5.5 do not
include the clock generator and driver power consumption and layout area. The N-P Domino
logic maximum frequency of operation is much lower than that of the Domino logic due to
the use of slow P-channel transistors in evaluating the logic. Layout area for both Domino
and N-P Domino circuits is comparable. N-P Domino power consumption is lower than 
that for the Domino circuit due to the reduction in the dynamic power (frequency 
dependent). 

CGaAs TPDL logic has the highest maximum operating frequency, about 1.5 times
that of Domino and about four times that of static logic. The layout area is comparable to
that of Domino and N-P Domino logic and less than half that for static circuits. Even
though the power consumption listed in Table 5.5 is the power consumed at the maximum
frequency, TPDL circuit power consumption is significantly below that of the static circuit
(less than one third). TPDL logic uses only the fast N-channel transistors in evaluating a
logic function and uses the slow P-channel transistors only for precharging the output
nodes. It does not consume any static power except for a small amount of leakage power
(which decreases when lowering the supply voltage). Most of the consumed power is
dynamic power. This logic family requires two clock phases and their complements for
proper operation. Power consumption and layout area listed in Table 5.5 do not include
those for the clock generator and driver circuits. Also, this family requires routing the four
clock signals (two clock phases and their complements) to all the circuits, which increases
the interconnect area and the design complexity.

For a fair comparison, the power consumption should be calculated at the same
frequency. If this is done, TPDL power consumption at the static-circuit maximum
operating frequency will be much lower than the value listed in Table 5.5. Moreover, at this
low frequency, relative to the maximum frequency of TPDL circuits, TPDL circuits could
be powered from a lower power supply voltage and operated with lower input and
output signal transitions, which reduces the consumed power. Also, for the power
consumption comparison to be fair, power consumed by the clock generator (to generate
φ1, φ2 and their complements required for TPDL circuits) should be considered.
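A rough feel for the equal-frequency comparison can be obtained by scaling the TPDL entry of Table 5.5 down to the static circuit's maximum frequency. The sketch below assumes that the power is purely dynamic and therefore scales linearly with frequency at a fixed supply voltage and switched capacitance (the usual first-order model). This is an assumption made here for illustration; it ignores leakage, clock-generator power and any supply reduction, and the result is an estimate, not a simulated value from this work.

```python
def scale_dynamic_power(p_mw, f_from_ghz, f_to_ghz):
    """First-order estimate: dynamic power is proportional to frequency
    when supply voltage and switched capacitance are held constant."""
    return p_mw * f_to_ghz / f_from_ghz

# TPDL entry of Table 5.5 (2.38 mW at 2.38 GHz) scaled to the static
# circuit's maximum operating frequency (0.62 GHz).
tpdl_at_static_f = scale_dynamic_power(2.38, 2.38, 0.62)
print(f"{tpdl_at_static_f:.2f} mW")  # 0.62 mW
```

Under this assumption the TPDL circuit would consume well under a tenth of the 8.69 mW listed for the static circuit at the same frequency.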

The power consumption as a function of frequency for all the studied logic families
implementing the function F1 is shown in Figure 5.33. This figure illustrates the low
power consumption of the TPDL design compared to all other designs. Looking at the 
above comparison, one can conclude that TPDL circuits are the best logic family for 
building the next generation of high density circuits. CGaAs TPDL circuits are also suitable 
for pipelined architectures. When used in pipelined architectures, they do not require 
storage elements between stages. Logic levels are stored on transistor gates of the N- 
channel transistor logic blocks during the precharging phase. During the precharging phase, 
the pass gates preceding the logic-blocks are turned off to protect the stored data from 
corruption. 
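The ratios quoted in this section can be reproduced directly from the entries of Table 5.5. The short script below is only a convenience sketch; the dictionary layout and the helper name `ratio` are illustrative choices, while the numbers are copied from the table.

```python
# Values copied from Table 5.5 (function F1).
table = {
    "Static":     {"f_ghz": 0.62, "p_mw": 8.69,  "area_um2": 56.0},
    "Domino":     {"f_ghz": 1.61, "p_mw": 3.54,  "area_um2": 19.6},
    "N-P Domino": {"f_ghz": 0.82, "p_mw": 2.487, "area_um2": 22.4},
    "TPDL":       {"f_ghz": 2.38, "p_mw": 2.38,  "area_um2": 25.2},
}

def ratio(family, reference, key):
    """Ratio of one family's figure of merit to that of a reference family."""
    return table[family][key] / table[reference][key]

for ref in ("Static", "Domino", "N-P Domino"):
    print(f"TPDL / {ref}: "
          f"frequency x{ratio('TPDL', ref, 'f_ghz'):.2f}, "
          f"power x{ratio('TPDL', ref, 'p_mw'):.2f}, "
          f"area x{ratio('TPDL', ref, 'area_um2'):.2f}")
```

Note that each power value is taken at that family's own maximum frequency, so the power ratios are not an equal-frequency comparison.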



Table 5.5: Comparison of CGaAs Logic Families

Logic Family   Maximum Frequency   Average Power   Layout Area
               [GHz]               [mW]            [µm²]
------------   -----------------   -------------   -----------
Static         0.62                8.69            56
Domino         1.61                3.54            19.6
N-P Domino     0.82                2.487           22.4
TPDL           2.38                2.38            25.2

Figure 5.33: Power Consumption of Logic Circuits Implementing Function F1



F. LOADING EFFECTS ON CGaAs LOGIC FAMILIES 

In this section, the effect of increasing the output load on the performance of the
different CGaAs logic families is discussed. The four logic families studied are static
logic, Domino logic, N-P Domino logic and TPDL. The function F1 = ((A + B) + C)
discussed in the previous sections is used again. All logic circuits are powered
from the same power supply (2.0 volts). Also, the same input transitions are applied to
all circuits (input transitions between 0.0 volts and 1.75 volts). The maximum operating
frequency, as a function of the number of loads, is plotted in Figure 5.34 for all logic
families. From this figure, it can be seen that the maximum frequency of operation
decreases nonlinearly as the number of loads increases. The maximum
operating frequency of the TPDL family is always the highest of all the
studied logic families. Also, when the number of loads is 10 for the TPDL circuit, its
maximum frequency is still about two times that of the static circuit with one load.




G. EFFECT OF POWER SUPPLY VOLTAGE ON CGaAs LOGIC FAMILIES 

In this section, the effect of changing the power supply and input signal voltage on the
maximum operating frequency and power consumption of all CGaAs logic families is
studied. This will stress the advantage of the TPDL family over the other logic families.
The logic function F1 = ((A + B) + C) will be used as a study case.

CGaAs circuits implementing the logic function F1 have been designed and simulated
using all four logic families. Simulations were performed with HSPICE with all
circuits loaded by two inverters (fan-out of two). The power supply voltages at which
all circuits were simulated were 2.00, 1.75, 1.50, 1.25 and 1.00 volts. When the power
supply voltage was 2.0 volts, the input signal transition was 1.75 volts peak-to-peak.
Otherwise, the power supply voltage was equal to the peak-to-peak input signal
transition. The highest power supply voltage applied to the circuit was limited by the
transistor drain-to-source leakage current. The highest gate voltage transition was limited
by the transistor gate leakage current. The maximum operating frequency, as a function of
power supply voltage for all logic families, is plotted in Figure 5.35. The power
consumption, as a function of the power supply voltage for all logic families, is plotted in
Figure 5.36. It can be seen from these two figures that the TPDL family has the highest
maximum operating frequency with the lowest power consumption. This agrees with the
results obtained in the previous sections. When the power supply is decreased from 2.0
volts to 1.75 volts, while keeping input signal transitions at 0.0 volts to 1.75 volts, the
maximum operating frequency of all dynamic logic families does not decrease, while for
static logic it does decrease, as shown in Figure 5.35. Also, decreasing the power supply
from 2.00 to 1.75 volts decreases the power consumption for all logic families, as shown
in Figure 5.36. Decreasing the power supply voltage and the input swings below 1.75 volts
reduces both the maximum frequency and the average power consumption of all logic
families.

From this section and the previous two sections, one can conclude that the performance 
of TPDL logic is superior to all other dynamic and static logic families. The TPDL family 
will be chosen for designing and implementing the circuits of the following chapters.

Figure 5.35: Maximum Operating Frequency of Logic Families

Figure 5.36: Power Consumption of Logic Families

VI. DESIGN OF COMPLEMENTARY GaAs MULTI-LEVEL TPDL
AND STATIC LOGIC CIRCUITS 



A multi-level logic circuit is a block of logic that can be divided into subblocks such 
that each subblock can be considered as a separate circuit. A multi-level logic function is a 
function that has multiple inputs and outputs and that requires a multi-level circuit for 
implementation. A Four-Bit Carry Lookahead Adder (4-Bit CLA) is an example of a
multi-level logic function. In this chapter, a 4-Bit CLA will be designed and implemented using
a multi-level logic circuit. In Chapter V, it was shown that Two-Phase Dynamic Logic 
(TPDL) is the optimal dynamic logic family. It has the highest maximum operating 
frequency and the lowest power consumption of all the studied dynamic logic families. The 
low power consumption of TPDL logic allows complementary dynamic GaAs circuits to 
enter the LSI and VLSI era. In this chapter, TPDL is the only dynamic logic family that will 
be used in circuit designs and implementations. However, circuits designed and 
implemented using TPDL will also be designed and implemented in static logic for 
comparison purposes. 

In this chapter, optimal designs for 4-Bit CLAs using static logic and TPDL logic
are developed. A comparison is then performed between the two designs for maximum
speed, total power consumption and layout area. Section A gives an overview of the carry
lookahead adder and the arithmetic equations. The design and analysis of the CGaAs 4-Bit
CLA, using static logic and pipelined static logic, is explained in Section B. Section C shows
the design and analysis of the same adder using TPDL. The comparison between the static
logic and TPDL designs for speed, power consumption and layout area is explained in
Section D.

A. CARRY-LOOKAHEAD ADDER OVERVIEW 

All of the various arithmetic operations (add, subtract, multiply and divide) can be 
implemented by appropriate combinations of the add function. Thus, addition is the 
universal data operation for a computer Arithmetic Logic Unit (ALU). In the following
subsections, the basic design and the operation of different adder configurations will be
discussed.

1. Basic Add/Subtract logic 

A Full Adder (FA) is the basic cell that is normally used to perform addition. In this
subsection, a one-bit FA, which can be used to implement an n-bit adder, is described. A
one-bit full adder adds two binary digits, $A_i$ and $B_i$, and one carry input $C_i$, to produce a sum
output $S_i$ and a carry output $C_{i+1}$. The sum output has the same significance as the three
inputs, while the carry output is one bit more significant. The two outputs are related to the
three inputs by the following Boolean equations:

$$S_i = A_i \oplus B_i \oplus C_i \tag{6.1}$$

$$C_{i+1} = A_i B_i + B_i C_i + A_i C_i \tag{6.2}$$

The basic adder cell can be modified to become a 4-input Controlled Add/Subtract cell
(CAS) by adding another input P which is used to control the add (P = 0) or subtract (P =
1) operations. In the case of subtraction, input $C_i$ is called borrow-in and the $C_{i+1}$ output is
called borrow-out. The input-output relationship of a CAS cell is specified by the following
pair of Boolean equations:

$$S_i = A_i \oplus B_i \oplus P \oplus C_i \tag{6.3}$$

$$C_{i+1} = ((A_i + C_i) \cdot (B_i \oplus P)) + A_i C_i \tag{6.4}$$

When P = 0, $B_i \oplus P = B_i$ and the two sets of equations (6.1/6.2 and 6.3/6.4) are identical.
When P = 1, $B_i \oplus P = \overline{B_i}$ and Equations 6.3 and 6.4 become

$$S_i = A_i \oplus \overline{B_i} \oplus C_i \tag{6.5}$$

$$C_{i+1} = A_i \overline{B_i} + \overline{B_i} C_i + A_i C_i \tag{6.6}$$
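The relationship between the full adder (Equations 6.1 and 6.2) and the CAS cell (Equations 6.3 and 6.4) can be checked exhaustively in a few lines. The sketch below is a behavioral model written for illustration, not a description of the circuit implementation; the function names are chosen here and do not appear elsewhere in this work.

```python
from itertools import product

def full_adder(a, b, c):
    """Equations 6.1 and 6.2: returns (sum, carry out)."""
    s = a ^ b ^ c
    c_out = (a & b) | (b & c) | (a & c)
    return s, c_out

def cas_cell(a, b, p, c):
    """Equations 6.3 and 6.4: p = 0 selects add, p = 1 selects subtract."""
    s = a ^ b ^ p ^ c
    c_out = ((a | c) & (b ^ p)) | (a & c)
    return s, c_out

# The CAS cell behaves as a full adder whose B input is replaced by B xor P,
# which is B itself for P = 0 and the complement of B for P = 1.
for a, b, p, c in product((0, 1), repeat=4):
    assert cas_cell(a, b, p, c) == full_adder(a, b ^ p, c)
```

The loop covers all sixteen input combinations, so both the P = 0 and P = 1 cases discussed above are verified.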

Any number of FA cells can be cascaded to form an n-bit ripple adder. The carry in of a
cell is driven from the carry out of the next least significant cell. Carry in to the least
significant cell is driven from the carry input to the circuit. Carry out of the most significant
cell is the carry output of the whole adder. Each full adder has a latency of two gate delays.
Thus, the n-bit adder will have a latency of 2n gate delays. For large values of n, the latency of the
ripple adder is very high, which slows the speed of computations. There are many
techniques to reduce the adder latency. One of these techniques is the use of carry
lookahead, which will be discussed in the following subsection.
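The cascading described above can be modeled behaviorally as follows. This is a minimal sketch for illustration; the list-of-bits representation (least significant bit first) and the function names are choices made here, and the latency function simply restates the two-gate-delays-per-cell figure given in the text.

```python
def full_adder(a, b, c):
    """One-bit full adder: returns (sum, carry out)."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)

def ripple_add(a_bits, b_bits, carry_in=0):
    """Ripple adder; a_bits and b_bits are lists of 0/1, LSB first."""
    sums, carry = [], carry_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # carry ripples to the next cell
        sums.append(s)
    return sums, carry

def ripple_latency(n, gate_delays_per_cell=2):
    """Worst-case latency of an n-bit ripple adder, in gate delays."""
    return gate_delays_per_cell * n

# A = 1010 and B = 0101 (written MSB first) with carry in = 1.
print(ripple_add([0, 1, 0, 1], [1, 0, 1, 0], carry_in=1))  # ([0, 0, 0, 0], 1)
print(ripple_latency(4))  # 8
```

With these operands the carry in propagates through every cell, which is the worst case for a ripple adder.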

2. Carry Generate, Propagate and Lookahead Functions 
The speed of a digital arithmetic processor depends on the speed of the adders used. 
The carry lookahead adder described here is used to speed up carry propagation in the 
addition operation. The carries entering all bit positions of a parallel adder are generated 
simultaneously using additional logic circuits. This results in a constant addition time 
independent of the adder length. However, for long words, carry lookahead is usually 
performed in 4-bit groups to reduce implementation costs. 

Let the vectors $A = A_{n-1}A_{n-2} \ldots A_1A_0$ and $B = B_{n-1}B_{n-2} \ldots B_1B_0$ be the augend and
addend inputs to an n-bit adder. Let $C_{i-1}$ be the carry input to the i-th bit position. The carry
input to the least significant bit position is denoted as $C_{-1}$. Let $S_i$ and $C_i$ be the sum and carry
outputs of the i-th stage, respectively. Two auxiliary functions are defined as follows:

$$G_i = A_i \cdot B_i \tag{6.7}$$

$$P_i = A_i \oplus B_i \tag{6.8}$$

The carry generate function $G_i$ reflects the condition that the carry originates at the i-th
stage. The function $P_i$, called carry propagate, is true when the i-th stage will pass the
incoming carry $C_{i-1}$ to the next higher stage. Substituting $P_i$ and $G_i$ into Equations 6.1 and
6.2 (before reduction) gives $S_i$ and $C_i$ in terms of $P_i$ and $G_i$:

$$S_i = A_i \oplus B_i \oplus C_{i-1}$$

$$S_i = P_i \oplus C_{i-1} \tag{6.9}$$

$$C_i = A_i \cdot B_i + (A_i \oplus B_i) \cdot C_{i-1}$$

$$C_i = G_i + P_i \cdot C_{i-1} \tag{6.10}$$

The above equations reveal that all $P_i$ and $G_i$ for $i = 0, 1, \ldots, n-1$ can be
generated simultaneously from the external inputs A and B. Also, all $S_i$ and $C_i$ can be
generated simultaneously. The recursive formula of Equation 6.10 can be applied repeatedly to
obtain all $C_i$ as follows:

$$C_0 = G_0 + C_{-1} \cdot P_0 \tag{6.11}$$

$$C_1 = G_1 + C_0 \cdot P_1$$

$$C_1 = G_1 + G_0 \cdot P_1 + C_{-1} \cdot P_0 \cdot P_1 \tag{6.12}$$

$$C_2 = G_2 + C_1 \cdot P_2$$

$$C_2 = G_2 + G_1 \cdot P_2 + G_0 \cdot P_1 \cdot P_2 + C_{-1} \cdot P_0 \cdot P_1 \cdot P_2 \tag{6.13}$$

$$C_{n-1} = G_{n-1} + G_{n-2} \cdot P_{n-1} + \cdots + C_{-1} \cdot P_0 \cdot P_1 \cdots P_{n-1} \tag{6.14}$$



These equations can be implemented using a carry lookahead unit. 
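As an illustration of Equations 6.7 through 6.14, the following behavioral sketch forms every carry directly from the generate and propagate signals and the carry input, without rippling, and then checks the result against integer addition for all 4-bit operands. The bit-list representation (least significant bit first) and the function names are assumptions made for this example only.

```python
from itertools import product

def cla_carries(a_bits, b_bits, c_in):
    """Return (G, P, C) lists, LSB first. Each carry C_i is computed in
    expanded form: G_i + G_{i-1} P_i + ... + C_{-1} P_0 ... P_i."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # Equation 6.7
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # Equation 6.8
    carries = []
    for i in range(len(g)):
        c = 0
        for j in range(i + 1):
            term = g[j]                           # G_j ...
            for k in range(j + 1, i + 1):
                term &= p[k]                      # ... times P_{j+1} ... P_i
            c |= term
        term = c_in                               # C_{-1} times P_0 ... P_i
        for k in range(i + 1):
            term &= p[k]
        carries.append(c | term)
    return g, p, carries

def cla_add(a_bits, b_bits, c_in):
    g, p, c = cla_carries(a_bits, b_bits, c_in)
    carry_into = [c_in] + c[:-1]                  # C_{i-1} for each bit i
    sums = [pi ^ ci for pi, ci in zip(p, carry_into)]  # Equation 6.9
    return sums, c[-1]

def to_bits(x, n=4):
    return [(x >> i) & 1 for i in range(n)]

# Exhaustive check against integer addition for 4-bit operands.
for a, b, c_in in product(range(16), range(16), (0, 1)):
    sums, c_out = cla_add(to_bits(a), to_bits(b), c_in)
    total = sum(s << i for i, s in enumerate(sums)) + (c_out << 4)
    assert total == a + b + c_in
```

In hardware each expanded carry corresponds to a two-level AND-OR network, which is why all carries become available at the same time.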

For n = 4, two additional terminal functions, called block carry generate $G^*$ and block
carry propagate $P^*$, can be used to form an additional circuit. This new design allows
4-bit adder “slices” to be connected together to form an 8-bit or larger multiple-of-4-bit
adder. This design is called a Block Carry Lookahead Adder (BCLA).



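$$P^* = P_0 \cdot P_1 \cdot P_2 \cdot P_3 \tag{6.15}$$

$$G^* = G_3 + G_2 \cdot P_3 + G_1 \cdot P_2 \cdot P_3 + G_0 \cdot P_1 \cdot P_2 \cdot P_3 \tag{6.16}$$

$$C_{out} = G^* + P^* \cdot C_{in} \tag{6.17}$$

where $C_{in}$ and $C_{out}$ are the carry in and carry out of the 4-Bit adder slice, respectively.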
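Equations 6.15 through 6.17 can be checked in the same behavioral style: the carry out of a 4-bit slice computed from the block generate and block propagate signals should equal the carry out of ordinary 4-bit addition. The sketch below is illustrative only; the function names and the bit ordering (least significant bit first) are choices made for this example.

```python
from itertools import product

def block_gp(a_bits, b_bits):
    """Block generate G* (Eq. 6.16) and block propagate P* (Eq. 6.15)
    for one 4-bit slice; bit lists are least significant bit first."""
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    p_star = p[0] & p[1] & p[2] & p[3]
    g_star = (g[3] | (g[2] & p[3]) | (g[1] & p[2] & p[3])
              | (g[0] & p[1] & p[2] & p[3]))
    return g_star, p_star

def slice_carry_out(a, b, c_in):
    """Carry out of a 4-bit slice from Equation 6.17."""
    bits = lambda x: [(x >> i) & 1 for i in range(4)]
    g_star, p_star = block_gp(bits(a), bits(b))
    return g_star | (p_star & c_in)

# Exhaustive check against the carry out of integer addition.
for a, b, c_in in product(range(16), range(16), (0, 1)):
    assert slice_carry_out(a, b, c_in) == (a + b + c_in) >> 4
```

Because the block signals depend only on the operand bits of the slice, a following slice can form its carry in as soon as the preceding slice's carry in is known.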

The block diagram of a 4-bit carry lookahead adder is shown in Figure 6.1. The
propagate and generate block is the circuit that produces the propagate and generate carry
signals ($P_i$ and $G_i$) required for the consecutive blocks. $P_i$ is generated using an XOR gate
while $G_i$ is generated using an AND gate. The $C_i$ and $S_i$ blocks generate the carry and
summation output signals, respectively, according to the above equations. The detailed design of these
blocks, using both static logic and TPDL, will be explained in the next two sections.




Figure 6.1: CGaAs 4-Bit Blocked CLA Logic Diagram

B. CGaAs STATIC 4-Bit CLA CIRCUIT DESIGN



In this section, the design of a 4-Bit CLA circuit using static logic will be explained in 
detail. The circuit in Figure 6.1 was designed and simulated using HSPICE simulation 
tools. Each block of the circuit is designed and simulated separately and then optimized for 
layout area and maximum operating frequency. Finally, all blocks are integrated to form 
the 4-Bit CLA. All the transistors used in these blocks have a gate length of 0.7 µm.

AND gates are used to generate the carry-generate signals ($G_i$) according to Equation
6.7. An AND gate consists of a NAND gate followed by an inverter. The design of the static
NAND gates and the static inverters is discussed in Chapter III. XOR gates are used to
generate the carry-propagate signals ($P_i$), according to Equation 6.8, and are also explained
in Chapter III. Summation terms ($S_0$, $S_1$, $S_2$ and $S_3$) are generated using XOR gates,
according to Equation 6.9. The circuit that generates the carry function $C_0$ from the inputs
$P_0$, $G_0$ and $C_{-1}$ is shown in Figure 6.3. The $C_1$ generator circuit is shown in Figure 6.4, while
the $C_2$ generator circuit is shown in Figure 6.5. According to Equation 6.15, $P^*$ requires a
four-input AND gate, which is designed using a four-input NAND gate followed by an
inverter, as shown in Figure 6.6. The circuit that generates the $G^*$ function is similar to the
one that generates $C_2$, shown in Figure 6.5. The circuit to generate $C_{out}$ from the inputs
$C_{-1}$, $P^*$ and $G^*$ is similar to that in Figure 6.3, which generates $C_0$. Transistor gate widths are
written on each transistor in all the schematics. These circuits are all designed similarly to
static CMOS circuits.

The above circuits are integrated to form the static 4-Bit CLA shown in Figure 6.1. An
exhaustive test structure has been used to check the functionality of the CLA. Input-output
waveforms of the circuit, running at the maximum operating frequency, are shown in
Figure 6.2. The simulation test structure was chosen such that all the summation ($S_0$, $S_1$, $S_2$
and $S_3$) and the carry out ($C_{out}$) outputs switch for every input transition. Figure 6.2
was plotted for the following input signal test structure: $A_3A_2A_1A_0$ = 1010, $B_3B_2B_1B_0$ =
0101. The carry in signal is generated from a pulse generator which switches between logic
low and logic high. Propagation delay, measured from the time of applying the input pulses
to the time the outputs switch, determines the maximum operating frequency of the circuit.
For the selected test structure, when the input carry in switches from logic low to logic high,
all summation signals switch from high to low and the carry out signal switches from low
to high. The CGaAs 4-Bit CLA was simulated with a 1.75 volt power supply. Each output
of the circuit was loaded by two static inverters (fan-out of two). The input logic-low level
is 0.0 volts while the logic-high level is 1.75 volts.

Due to the difference in the propagation paths of the summation and carry out
signals, they have different propagation delays. Therefore, the maximum frequency is
limited by the longest signal path (longest propagation delay). The longest propagation
delay is for the output $S_3$. The propagation delay measured from the carry in switching from
logic high to logic low to the summation output $S_3$ switching from logic
low to logic high is 1.45 ns. The propagation delay measured from the change in carry in,
from logic low to logic high, to the change in the summation output $S_3$, from logic high to
logic low, is 1.9 ns. The high and low durations of the applied input signal should each be equal
to or longer than the longest propagation delay of the circuit to prevent race conditions. This limits
the maximum frequency of the input signal to about 260 MHz (1/(2 × 1.9 ns)). The 4-Bit CLA
circuit consumes an average power of 26 mW at the maximum operating frequency.
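The frequency limit quoted above follows from simple arithmetic on the two measured delays. The short calculation below is a sketch of that reasoning; the function name and dictionary keys are illustrative, while the delay values are those reported in the text.

```python
def max_input_frequency_hz(worst_delay_s):
    """Each half-period must be at least the worst-case propagation delay,
    so the full period is at least twice that delay."""
    return 1.0 / (2.0 * worst_delay_s)

# Measured propagation delays to the S3 output, in nanoseconds.
delays_ns = {
    "carry in falling to S3 rising": 1.45,
    "carry in rising to S3 falling": 1.9,
}
worst = max(delays_ns.values()) * 1e-9
f_max = max_input_frequency_hz(worst)
print(f"{f_max / 1e6:.0f} MHz")  # 263 MHz
```

The exact quotient is slightly above 263 MHz, which the text rounds down to 260 MHz.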

The summation and the carry out signals do not arrive at the circuit output 
simultaneously. Thus, the circuit requires a register at the output to hold the information 
and apply it to the next stage simultaneously. This will add circuitry and increase the layout 
area, the transistor count and the power consumption of the circuit. Also, the maximum 
operating frequency of the circuit will be decreased due to the added delay through the 
register file. 

The pipelined static 4-Bit CLA solves the above problems. A pipelined adder increases
the maximum frequency of operation but at the same time increases the transistor count,
the power consumption and the layout area. The logic diagram of a three-stage, pipelined,
static, 4-Bit CLA is shown in Figure 6.7. Three stages of pipeline registers are added to
overcome the problem of propagation delay. A pipeline register consists of static D
Flip-Flops, one for each bit. The design and analysis of a CGaAs static D Flip-Flop is presented
in Chapter IV, Section A. The use of these pipeline registers assures that all summation
and carry out output signals are delivered to the output terminals simultaneously. The
number of transistors used in the pipelined adder circuit is 460. The circuit was simulated
using HSPICE to measure its performance. The maximum frequency of operation is limited
by the longest stage delay. The simulation power supply voltage was 1.75 volts and the
input signals switch between 0.0 volts and 1.75 volts. All the circuit outputs are loaded by
two static inverters (fan-out of two). The maximum operating frequency of the circuit is
550 MHz (more than double that for the static design). The power consumption of the
circuit is 77.4 mW at the maximum frequency of operation. The input-output waveforms
of the circuit at the maximum frequency are shown in Figure 6.8.

Figure 6.3: CGaAs Static Circuit for Generating the Function $C_0$

Figure 6.5: CGaAs Static Circuit for Generating the Function $C_2$

Figure 6.7: CGaAs Pipelined Static 4-Bit CLA Logic Diagram

Figure 6.8: Input-Output Waveforms of CGaAs Pipelined Static 4-Bit CLA



C. CGaAs TPDL 4-Bit CLA CIRCUIT DESIGN 

Using TPDL increases the throughput of the CGaAs 4-Bit CLA circuit. The logic
diagram of the CGaAs TPDL 4-Bit CLA is shown in Figure 6.9. All the logic blocks of this
figure will now be explained in detail. The carry generate signals (G0, G1, G2 and G3) are
created according to Equation 6.7 using TPDL AND gates. A TPDL AND gate consists of
a TPDL NAND gate followed by a TPDL inverter. Both gates are explained in Chapter III.
According to Equation 6.8, the carry propagate signals (P0, P1, P2 and P3) are generated
using TPDL XOR gates, which are also explained in detail in Chapter III. Both the TPDL
AND gate and the TPDL XOR gate have a fill time of one clock cycle, i.e., all
carry-propagate and carry-generate signals are generated one clock period after the inputs
Ai and Bi are applied. A one-clock delay TPDL D Flip-Flop is explained in Chapter IV,
Section B. The D Flip-Flop output is delayed by one clock period and is used to align the
input signals of the next logic block so that they are in phase. The Pi and Gi signals arrive
at the Ci inputs one clock period after the input vectors Ai and Bi are applied. The carry in
signal is applied to Ci through a delay block, as shown in Figure 6.9, to be in phase with all
Pi and Gi signals. A one clock period delay is applied to both inputs of the S0 generator
block to force all adder outputs to be in phase. The Si generator blocks in Figure 6.9 are
designed using TPDL XOR gates according to Equation 6.9. The C0 generator block is
designed according to Equation 6.11 using the TPDL logic circuit of Figure 6.11. The TPDL
circuit which generates the signal C1 is designed as shown in Figure 6.12, according to
Equation 6.12. Figure 6.13 shows the TPDL logic circuit used to generate the signal C2,
according to Equation 6.13. The TPDL circuit that generates the function P*, according to
Equation 6.15, is shown in Figure 6.14. The circuit that generates the function G*,
according to Equation 6.16, is similar to the circuit that generates C2, shown in Figure 6.13.
Cout is generated from the inputs Cin, P* and G*, according to Equation 6.17, using a
circuit similar to the one that generates C0, shown in Figure 6.11. All these TPDL logic
circuits are designed and optimized for layout area and speed before integration into the
adder circuit. The P* and G* blocks are added for two reasons. The first reason is to turn
the circuit into a block adder. The second reason is to reduce the number of series
N-channel transistors in the circuit P2, which uses a four-input AND gate instead of a
five-input AND gate. This increases the maximum operating frequency of the circuit.
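
The block structure described above can be checked with a short behavioural model. The
following Python sketch is only an illustration: Equations 6.7 through 6.17 are assumed
here to be the standard carry-lookahead expressions (Gi = Ai AND Bi, Pi = Ai XOR Bi,
Si = Pi XOR the incoming carry), and the names cla4 and to_int are hypothetical. It models
the logic only, not the TPDL clocking or the one-clock alignment delays.

```python
from itertools import product

def cla4(a, b, cin):
    """Behavioural 4-bit block carry-lookahead adder (logic only, no timing).

    a, b: lists of four bits, index 0 = least significant bit.
    Returns (sum_bits, cout, p_star, g_star).
    The equations are the textbook CLA forms, assumed to match Eq. 6.7-6.17.
    """
    g = [a[i] & b[i] for i in range(4)]   # carry generate
    p = [a[i] ^ b[i] for i in range(4)]   # carry propagate

    # Lookahead carries out of bits 0, 1 and 2
    c0 = g[0] | (p[0] & cin)
    c1 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & cin)
    c2 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & cin))

    # Block propagate and block generate
    p_star = p[0] & p[1] & p[2] & p[3]
    g_star = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
              | (p[3] & p[2] & p[1] & g[0]))
    cout = g_star | (p_star & cin)

    carries_in = [cin, c0, c1, c2]        # carry entering each bit position
    s = [p[i] ^ carries_in[i] for i in range(4)]
    return s, cout, p_star, g_star

def to_int(bits):
    """Convert an LSB-first bit list to an integer."""
    return sum(bit << i for i, bit in enumerate(bits))

# Exhaustive check against integer addition (all 512 input combinations)
for bits in product((0, 1), repeat=9):
    a, b, cin = list(bits[0:4]), list(bits[4:8]), bits[8]
    s, cout, _, _ = cla4(a, b, cin)
    assert to_int(s) + (cout << 4) == to_int(a) + to_int(b) + cin
```

In this form no product term has more than four inputs. Expanding Cout directly, without
P* and G*, would require the five-input term P3 P2 P1 P0 Cin.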

The fill time of the TPDL 4-Bit CLA circuit is three clock periods (the outputs are
available three clock periods after applying the inputs). The circuit was simulated with
HSPICE to measure the performance. The power supply voltage used in the simulation was
1.75 volts. Input logic low and logic high levels were 0.0 volts and 1.75 volts, respectively.
All output terminals of the circuit were loaded by two TPDL inverters (fan-out of two). The
circuit was tested exhaustively to examine its functionality. The maximum operating
frequency of the circuit is 1.22 GHz and it consumes an average power of 61.79 mW at that
frequency. Input-output waveforms of the circuit are shown in Figure 6.10. In this figure,
the following test vector is applied: A3A2A1A0 = 1010, B3B2B1B0 = 0101. The carry in
signal alternates between logic low and logic high levels. The selected input vectors
force all outputs of the circuit to switch between logic states for each input change. All
sum and carry out outputs precharge when φ2 is logic low because they are taken
from φ2 gates. The effects of the input changes appear at the outputs after three clock cycles
(the fill time of the circuit).
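
The choice of test vector can be checked with simple arithmetic. The sketch below uses
plain Python integers (it is not part of the HSPICE simulation, and the helper name
adder_outputs is hypothetical) to evaluate A = 1010 and B = 0101 for both values of the
carry in.

```python
A = 0b1010  # A3 A2 A1 A0
B = 0b0101  # B3 B2 B1 B0

def adder_outputs(a, b, cin):
    """Return (S3..S0 as a 4-bit string, carry out) for a 4-bit addition."""
    total = a + b + cin
    return format(total & 0b1111, "04b"), total >> 4

low = adder_outputs(A, B, 0)    # carry in at logic low
high = adder_outputs(A, B, 1)   # carry in at logic high
print(low, high)                # ('1111', 0) ('0000', 1)
```

Every sum bit and the carry out change state when the carry in toggles, which is why this
vector exercises all outputs of the adder.
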

Figure 6.9: CGaAs TPDL 4-Bit CLA Logic Diagram
Figure 6.10: Input-Output Waveforms of CGaAs TPDL 4-Bit CLA
Figure 6.11: CGaAs TPDL Circuit for Generating Function C0
Figure 6.12: CGaAs TPDL Circuit for Generating Function C1
Figure 6.13: CGaAs TPDL Circuit for Generating Function C2

D. COMPARISON BETWEEN CGaAs STATIC, PIPELINED STATIC AND TPDL 4-Bit CLA

In the previous sections, the design and HSPICE simulation of a 4-Bit CLA using static
logic, pipelined static logic and TPDL were completed. In this section, these designs are
compared for speed, power consumption and layout area. Table 6.1 lists the maximum
operating frequency of each design and the power consumption at that frequency [51].
The number of transistors used and the layout area of each circuit are also listed in the
table. The layout area listed in the table is the transistor gate area and does not include the
interconnect area between the transistors. Also, the layout area of the TPDL design does
not include the area of the clock generator that is required for proper operation. The CGaAs
TPDL CLA has the highest operating frequency of all the studied CGaAs CLA logic
designs. Its maximum frequency is more than double that of the pipelined static adder and
more than four times that of the static adder. The power consumed by the TPDL adder at
its maximum frequency is less than the power consumed by the pipelined adder at roughly
half of that frequency.

For the comparison to be fair, the layout area and the power consumption of the
non-overlapped clock generator designed in Chapter III, Section C and required for the
operation of the TPDL circuits, should be added to the layout area and power consumption
of the TPDL adder. The clock generator has a layout area of 81 µm² and consumes an
average power of 12 mW at 300 MHz and 25 mW at 1.0 GHz (power consumption of the
clock generator at different frequencies is plotted in Figure 3.35). With the clock generator
included, the transistor count of the TPDL adder becomes 484 and its layout area becomes
1190 µm². It is important to compare the power consumption of all circuits at the same
frequency. The average power consumption of the static, pipelined static and TPDL adders
at 0.26 GHz is 26 mW, 42.74 mW and 23.82 mW, respectively. At 550 MHz, the pipelined
static adder consumes 77.4 mW while the TPDL adder consumes 43.66 mW. The static
adder does not function at this frequency.
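
The ratios quoted in this comparison follow directly from the listed numbers. The Python
sketch below recomputes them; all values are copied from the text and Table 6.1, the
variable names are hypothetical, and the area unit is assumed to be the same for the adder
and the clock generator.

```python
# Values quoted from Table 6.1 and the surrounding text
fmax_ghz = {"static": 0.26, "pipelined": 0.55, "tpdl": 1.22}
power_at_260mhz_mw = {"static": 26.0, "pipelined": 42.74, "tpdl": 23.82}
power_at_550mhz_mw = {"pipelined": 77.4, "tpdl": 43.66}

# Maximum-frequency advantage of the TPDL adder
speedup_vs_pipelined = fmax_ghz["tpdl"] / fmax_ghz["pipelined"]
speedup_vs_static = fmax_ghz["tpdl"] / fmax_ghz["static"]

# TPDL power relative to the pipelined static adder at equal frequency
ratio_260 = power_at_260mhz_mw["tpdl"] / power_at_260mhz_mw["pipelined"]
ratio_550 = power_at_550mhz_mw["tpdl"] / power_at_550mhz_mw["pipelined"]

# TPDL layout area with the clock generator included
tpdl_area_with_clock = 1109.5 + 81.0

print(f"speed-up: {speedup_vs_pipelined:.2f}x, {speedup_vs_static:.2f}x")
print(f"power ratio: {ratio_260:.2f}, {ratio_550:.2f}")
print(f"area with clock generator: {tpdl_area_with_clock:.1f}")
```

The speed-up factors come out at roughly 2.2 and 4.7, and the TPDL adder draws a little
over half the power of the pipelined static adder at both common frequencies.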

Figure 6.15 shows the power consumption of the three adder designs and the frequency
range of their operation. From this figure, it can be seen that power consumption increases
as the frequency increases for the static adder and the TPDL adder. However, the rate at
which power increases for the static circuit is greater than for the TPDL circuit. The power
consumption of the static adder increases linearly with frequency. The rate of increase of
the power consumption of the TPDL adder decreases as the frequency increases and
approximates a logarithmic function. At any frequency, the power consumption of the
TPDL adder is about half of that of the pipelined static adder. It is also noted from the
figure that the maximum frequency of the static adder is very low compared to that of the
pipelined or the TPDL designs. The power-delay product of both the static and the TPDL
adders is plotted in Figure 6.16. The power-delay product decreases as the power supply
voltage is reduced because of the decrease in the leakage current.



TABLE 6.1: Comparison of CGaAs 4-Bit CLA Designs

Logic Family       Maximum Frequency   Power Consumption   Layout Area   Transistor
                   [GHz]               [mW]                [µm²]         Count
Static             0.26                26                  989           236
Pipelined Static   0.55                77.4                1853          516
TPDL               1.22                61.79               1109.5        450



Loading effects on the performance of the designed CLA circuits have also been
studied. The three designs (static, pipelined static and TPDL) of the CLA have been
simulated in HSPICE with a 1.75 volt power supply. The input signals switch between 0.0
volts and 1.75 volts. The output load was varied to measure the maximum operating
frequency of the circuit when driving different loads. For the static and pipelined static
adders, the loads were static inverters. For the TPDL adder, the loads were TPDL inverters.
The number of loads was varied from one to ten and the maximum operating frequency of
each adder was recorded for each load. Figure 6.17 shows HSPICE simulation results of
the maximum frequency of operation for the three adders driving different loads.

For the static adder, the limiting parameter for the maximum frequency of operation is
the propagation delay through the entire adder. Increasing the load increases the output
capacitance of the adder, which increases the charging and discharging times of the output
nodes. Therefore, the maximum frequency of the circuit decreases linearly as the output
load increases from one to ten.

For the pipelined static adder, the limiting parameter for the maximum frequency of
operation is the longest stage propagation delay. Fortunately, the longest delay of the three
stages is that of the middle stage, and increasing the load lengthens only the delay of the
last stage. Therefore, increasing the load from one to six does not affect the maximum
frequency of the adder. As the load increases to seven, the propagation delay through the
last stage becomes longer than that of the middle stage and the last stage delay becomes the
critical delay, which limits the maximum frequency of operation. Beyond a fan-out of
seven, the maximum frequency decreases linearly with increasing load.
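
The plateau followed by a roll-off can be reproduced with a simple stage-delay model. The
Python sketch below is illustrative only: the middle-stage delay is taken from the 550 MHz
limit quoted above, while the intrinsic delay and the per-load delay of the last stage are
hypothetical values chosen so that the crossover falls at a fan-out of seven. They are not
taken from the HSPICE results.

```python
MIDDLE_STAGE_DELAY_NS = 1 / 0.55   # period at the 550 MHz limit quoted above

def last_stage_delay_ns(fan_out, intrinsic_ns=1.2, per_load_ns=0.1):
    """Hypothetical linear delay model for the loaded output stage."""
    return intrinsic_ns + per_load_ns * fan_out

def fmax_ghz(fan_out):
    """The maximum clock frequency is set by the slowest pipeline stage."""
    critical_ns = max(MIDDLE_STAGE_DELAY_NS, last_stage_delay_ns(fan_out))
    return 1 / critical_ns

for fan_out in range(1, 11):
    print(fan_out, round(fmax_ghz(fan_out), 3))
```

With these assumed numbers the maximum frequency stays at 0.55 GHz for fan-outs of one
to six and starts to fall at a fan-out of seven, once the last stage becomes the critical one.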

For the TPDL adder, the load capacitance is separated from the output by a
transmission gate. Thus, increasing the load capacitance does not increase the output
capacitance of the TPDL circuit. The limiting factor for the maximum operating frequency
is the charge redistribution problem, which is common to all dynamic circuit designs. This
isolation of the load is an additional advantage of the TPDL designs.

The power supply and input signal levels have also been varied to study their effect on 
the maximum operating frequency and the power consumption of the different logic 
designs of the 4-Bit CLA. The highest power supply voltage used in the HSPICE 
simulations is limited by the source-drain leakage current, while the highest input voltage 
level is limited by the gate leakage current of the transistors. The power supply and the 
peak-to-peak input voltage are varied from 1.75 volts to 1.00 volt in 0.25 volt steps. The 
maximum frequency of operation for each circuit, and its power consumption at that 
frequency for each power supply voltage, are listed in Table 6.2. The TPDL adder can 
function properly up to 292 MHz at a power supply of 1.00 volt. The power consumption 
is 2.1 mW, which is less than one-tenth of the power consumed by the static adder for 
proper functioning at the same frequency. 
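
The entries of Table 6.2 can also be converted into an energy per clock cycle, which is one
way of expressing a power-delay product. The Python sketch below assumes that the delay
is the clock period at the maximum frequency (1/Fmax). This definition is an assumption
made for illustration and is not necessarily the one used for Figure 6.16.

```python
# (supply [V], Fmax [GHz], Pav [mW]) rows copied from Table 6.2
table_6_2 = {
    "static": [(1.75, 0.262, 26.00), (1.50, 0.217, 12.00),
               (1.25, 0.151, 4.82), (1.00, 0.091, 1.56)],
    "pipelined": [(1.75, 0.550, 77.40), (1.50, 0.413, 34.10),
                  (1.25, 0.262, 12.00), (1.00, 0.135, 3.25)],
    "tpdl": [(1.75, 1.220, 61.79), (1.50, 1.090, 30.07),
             (1.25, 0.758, 12.39), (1.00, 0.292, 2.10)],
}

def energy_per_cycle_pj(fmax_ghz, pav_mw):
    """mW divided by GHz gives pJ: average energy per clock period at Fmax."""
    return pav_mw / fmax_ghz

pdp = {
    design: [round(energy_per_cycle_pj(f, p), 1) for _, f, p in rows]
    for design, rows in table_6_2.items()
}
for design, values in pdp.items():
    print(design, values)   # one value per supply voltage, 1.75 V first
```

Under this assumption the energy per cycle falls monotonically with the supply voltage for
all three designs, and the TPDL adder has the lowest value at every supply voltage.
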

This concludes the circuit-level design of both the static and dynamic circuits used in
this research. The following chapter describes the implementation of the circuits designed
in this chapter and the previous chapters. Seven different integrated circuits have been
implemented. The performance of all the implemented ICs is also included in Chapter VII.




Figure 6.15: Power Consumption of 4-Bit CLAs 



TABLE 6.2: Performances of CGaAs 4-Bit CLA Designs

Power        Static Design          Pipelined Static       TPDL Design
Supply [V]   Fmax [GHz]  Pav [mW]   Fmax [GHz]  Pav [mW]   Fmax [GHz]  Pav [mW]
1.75         0.262       26.00      0.55        77.40      1.22        61.79
1.5          0.217       12.00      0.413       34.1       1.09        30.07
1.25         0.151       4.82       0.262       12.0       0.758       12.39
1.0          0.091       1.56       0.135       3.25       0.292       2.10

Figure 6.16: Power-Delay Product of 4-Bit CLAs
Figure 6.17: Loading Effects on CGaAs 4-Bit CLAs

VII. COMPLEMENTARY GaAs CIRCUIT IMPLEMENTATIONS AND TEST RESULTS



This chapter discusses the implementation (layout) of the circuits that have been 
designed, analyzed and optimized in the previous chapters. All circuits have been laid out 
using CADENCE tools and a proprietary CGaAs technology file supplied by Motorola 
Semiconductor. All layouts have passed the design rule check. The technology file used for 
the design rule checking is also proprietary. Seven integrated circuits have been 
implemented, translated into the GDSII format and forwarded to Motorola for fabrication.

All the implemented circuits use output drivers that are designed to drive a 50 Ω load
and up to 15 pF of parasitic capacitance. The circuits implemented will be discussed in the
following sections. Six of the designed circuits are compatible with the Micromanipulator
Corp. Analytical Probe Station Model 6100, which is available at the NPS for die probing.
Due to the limited number of high frequency test probes that can be used simultaneously 
and the difficulty of generating off-chip, multi-bit, test vectors at high speed, it was 
required that the number of high frequency I/O pins be minimal. For this reason, a three- 
bit, Linear Feedback Shift Register (LFSR) was designed and implemented on-chip to 
generate the input vectors required for testing the functionality of the designed circuits. The 
LFSR has only one input, which is the clock signal, and generates three outputs that are 
used as inputs for the designed circuits. The seventh circuit is the TPDL Carry Lookahead 
Adder (CLA) and it will be packaged for testing after fabrication. 

All implemented circuits have been simulated in HSPICE to test their functionality. 
The transistor model parameters used in the simulations were supplied by Motorola and are 
representative of the devices manufactured by the Motorola complementary GaAs 
fabrication processes. The parameters were extracted from actual wafer probing data. Also, 
the simulation tool (HSPICE) has superior convergence and modeling accuracy. Therefore, 
the simulation results should not deviate significantly from the actual measured results 
obtained after fabricating the chips. Full functionality is expected. Speed should not vary
by more than ±35%.

HSPICE simulation files for all the implemented circuits are presented in Appendix A.
The layouts of the implemented circuits are attached at the end of this chapter. The layouts
of the implemented chips, including the input and output pads, are presented in
Appendix B.

A. CGaAs INPUT RECEIVER AND OUTPUT DRIVER CIRCUITS 

The input receiver circuit is designed to minimize the loading effect on the circuit that
drives the receiver. The output driver circuit is designed to drive a load of 50 Ω with up to
15 pF of parasitic capacitance. The input receiver and output driver test circuit has been
designed and implemented without any additional circuits between the receiver and the
driver to test their functionality, drive capability and maximum operating frequency. The
receivers and drivers are used with the other implemented circuits. The maximum operating
frequency of the I/O test circuit is 0.9 GHz and is limited by the driver; the receiver
operates up to 1.0 GHz. The driver therefore limits the maximum frequency of any circuit
that employs it to 0.9 GHz. The gate lengths of all transistors in the driver are 0.7 µm. The
input receiver consists of four cascaded inverters with N- and P-channel transistor gate
widths as follows: 6 µm and 5.6 µm for the first inverter, 12 µm and 12 µm for the second,
and 36 µm and 36 µm for both the third and the fourth inverters, respectively. The output
driver also consists of four inverters with N- and P-channel transistor gate widths as
follows: 6 µm and 6 µm for the first inverter, 12 µm and 12 µm for the second, 24 µm and
24 µm for the third, and 90 µm and 60 µm for the fourth inverter, respectively. The circuit
consumes an average power of 108 mW at 0.9 GHz from a 2.0 V power supply. The
maximum operating frequency decreases to 0.76 GHz when the supply voltage is decreased
to 1.75 V, and the average consumed power drops to 69 mW at this frequency. The input
and output waveforms of the circuit, with a supply voltage of 2.0 V, are shown in
Figure 7.5. The layout of the circuit, including all pads, is shown in Appendix B.
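
The two operating points quoted above can be compared using the first-order dynamic
power relation P = C·V²·f. The Python sketch below solves this relation for an effective
switched capacitance. It is a rough consistency check under the assumption that the power
is purely dynamic; it ignores any static dissipation in the 50 Ω load and any leakage
current, and the function name is hypothetical.

```python
def effective_capacitance_pf(power_mw, freq_ghz, vdd):
    """Solve P = C * Vdd^2 * f for C; mW / (V^2 * GHz) yields pF."""
    return power_mw / (vdd ** 2 * freq_ghz)

c_at_2v0 = effective_capacitance_pf(108.0, 0.90, 2.00)   # 2.0 V operating point
c_at_1v75 = effective_capacitance_pf(69.0, 0.76, 1.75)   # 1.75 V operating point
print(round(c_at_2v0, 1), round(c_at_1v75, 1))
```

Both operating points give an effective capacitance close to 30 pF, so under this simplified
model the drop in power is accounted for by the lower supply voltage and the lower
operating frequency.
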

B. CGaAs STATIC 3-BIT LFSR CIRCUIT



In this section, the implementation of a three-bit LFSR is detailed. The LFSR circuit
was required to minimize the number of input and output terminals of the implemented
circuits described in the next sections. The design and optimization of the static 3-Bit LFSR
is explained in Chapter IV. The circuit has one input (clock) and generates three outputs.
The three outputs cycle through seven states (all the states except state 000), which are
shown in Chapter IV, Table 4.1. These outputs are used as inputs for the static function
generator (the three-input circuit explained in the next section). The LFSR circuit consists
of three D Flip-Flops and one XOR gate and is shown in Chapter IV, Figure 4.7. The circuit
was laid out with an input receiver connected to the input and an output driver connected to
each of the three outputs of the circuit. The maximum operating frequency of the
implemented circuit is 0.55 GHz. The total power consumption (including the power
consumption of the driver circuits) at the maximum operating frequency is 214 mW. The
power consumption of the LFSR circuit by itself is explained in Chapter IV. Input and
output waveforms of the implemented circuit are shown in Figure 7.6. The HSPICE
simulation file for the circuit is included in Appendix A.2. The layout of the circuit is shown
in Figure 7.14, while the layout of the entire circuit, including input and output pads, is
presented in Appendix B.
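
The behaviour of a three-bit LFSR built from three storage bits and one XOR gate can be
sketched in a few lines of Python. Table 4.1 and Figure 4.7 are not reproduced here, so the
feedback taps, the seed and the resulting state order in this sketch are assumptions chosen
to give a maximal-length sequence; the implemented circuit may use a different tap
arrangement.

```python
def lfsr3_step(state):
    """Advance a 3-bit shift register by one clock.

    The feedback is the XOR of the two most significant bits (an assumed
    tap choice) and is shifted into the least significant bit.
    """
    feedback = ((state >> 2) ^ (state >> 1)) & 1
    return ((state << 1) | feedback) & 0b111

def lfsr3_states(seed=0b001, cycles=7):
    """Return the register contents for `cycles` consecutive clocks."""
    states = []
    state = seed
    for _ in range(cycles):
        states.append(state)
        state = lfsr3_step(state)
    return states

sequence = lfsr3_states()
print([format(s, "03b") for s in sequence])
# ['001', '010', '101', '011', '111', '110', '100']
```

The register visits all seven non-zero states before repeating. State 000 is excluded
because it maps onto itself, so the register would never leave it.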

C. CGaAs STATIC TWO-LEVEL FUNCTION GENERATION 

In this section, the static logic circuit that generates the logic function
F1 = ((A + B) + C), explained in Chapter V, is implemented. The logic diagram of
the implemented circuit is shown in Figure 7.1. The maximum operating frequency of the
implemented circuit is limited by the maximum frequency of the LFSR (previous section)
to 0.55 GHz. The circuit consumes an average power of 85.5 mW from a 2.0 V power
supply at the maximum frequency. The input and output waveforms of the circuit,
operating at maximum frequency, are shown in Figure 7.7. The HSPICE simulation file of
the circuit is included in Appendix A.3. The layout of the function F1 is shown in Figure
7.15, while the layout of the entire circuit, including the input and output pads, is shown in
Appendix B.



Figure 7.1: Logic Diagram of Static Logic Function F1 Generator



D. CGaAs NON-OVERLAPPING TWO-PHASE CLOCK GENERATOR

In this section, the implementation of the clock generator designed in Section C of
Chapter III is described. The circuit has a clock input and generates the non-overlapped
clock phases φ1 and φ2 (non-overlapped in the logic low level) and their complements.
These clock phases are required for the proper operation of the Two-Phase Dynamic FET
Logic (TPDL) circuits. The maximum operating frequency of the implemented circuit is
0.9 GHz (limited by the maximum operating frequency of the driver circuits). The total
power consumption of the circuit at the maximum operating frequency, when powered
from a 2.0 V power supply, is 339 mW (including the power consumption of the driver
circuits). The HSPICE simulation file of the implemented circuit is included in Appendix
A.4. Input and output waveforms of the circuit operating at the maximum frequency are
shown in Figure 7.8. The layout of the clock generator is shown in Figure 7.16, while the
layout of the entire circuit, including all input and output pads, is presented in Appendix B.

E. CGaAs TPDL 3-BIT LFSR CIRCUIT 

In this section, the implementation of the TPDL 3-Bit LFSR, designed in Section E of
Chapter IV, is described. The LFSR design was required to minimize the number of input
and output pins of the designed TPDL circuits. The logic diagram of the implemented
circuit is shown in Figure 7.2. The circuit has one input (clock input) and three outputs. The
generated output sequence is shown in Table 4.1. The outputs are used as test patterns for
the inputs of the TPDL three-input circuit explained in the next section. The LFSR circuit
consists of two TPDL D Flip-Flops and one dynamic XOR (DXOR) gate. The DXOR gate,
used as a separate stage, reduces the required number of D Flip-Flops by one. The
maximum operating frequency of the implemented circuit is limited by the maximum
frequency of the drivers to 0.9 GHz. The circuit consumes an average power of 320 mW
from a 2.0 V power supply at the maximum operating frequency (including the power
consumption of the on-chip drivers and clock generator). The power consumption of the
LFSR circuit by itself is explained in Chapter IV. The HSPICE simulation file of the
implemented circuit is presented in Appendix A.5. The input and output waveforms of the
circuit are shown in Figure 7.9. The layout of the LFSR circuit is shown in Figure 7.17,
while the layout of the entire circuit, including all the pads, is presented in Appendix B.

Figure 7.2: Logic Diagram of TPDL 3-Bit LFSR



F. CGaAs TPDL TWO-LEVEL FUNCTION GENERATION 

In this section, the implementation of the TPDL circuit that generates the logic
function F1 = ((A + B) C), designed in Chapter IV, is discussed. The three inputs of
the circuit are generated by a 3-bit LFSR. The maximum operating frequency of the
implemented circuit is limited by the maximum frequency of the drivers to 0.9 GHz. The
power consumption of the circuit at the maximum frequency, when powered from a 2.0 V
power supply, is 172 mW (including the on-chip drivers, clock generator and the LFSR
power consumption). The logic diagram of the implemented circuit is shown in Figure 7.3,
while the HSPICE simulation file of the circuit is presented in Appendix A.6. The input and
output waveforms of the circuit are shown in Figure 7.10. The layout of the logic function
F1 generator circuit is shown in Figure 7.18, while the layout of the entire circuit, including
all pads, is presented in Appendix B.



Figure 7.3: Logic Diagram of TPDL Logic Function F1 Generator



G. CGaAs TPDL 4-BIT CARRY LOOKAHEAD ADDER CIRCUIT 

The design and optimization of the TPDL 4-Bit Carry Lookahead Adder (TPDL 4-Bit
CLA) was presented in Chapter VI, Section C. The implementation of the circuit,
including the drivers and the two-phase clock generator, is presented in this section. The
logic diagram of the implemented circuit is shown in Figure 7.4. The maximum frequency
of the circuit is limited by the drivers (static circuit) to 0.92 GHz. The circuit, including the
clock generator and the drivers, consumes an average power of 800 mW from a 1.75 V
power supply at the maximum operating frequency. The HSPICE simulation file is
presented in Appendix A.7. The input and output waveforms of the circuit operating at the
maximum frequency are shown in Figure 7.11. The layout of the adder circuit is shown in
Figure 7.19, while the layout of the entire circuit is presented in Appendix B.

Figure 7.4: Logic Diagram of TPDL 4-Bit CLA
Figure 7.5: Input-Output Waveforms of Input Receiver and Output Driver IC
Figure 7.6: Input-Output Waveforms of Static 3-Bit LFSR IC
Figure 7.8: Input-Output Waveforms of Two-Phase Clock Generator IC
Figure 7.9: Input-Output Waveforms of TPDL 3-Bit LFSR IC
Figure 7.10: Input-Output Waveforms of TPDL Function F1 IC
Figure 7.11: Input-Output Waveforms of TPDL 4-Bit CLA
Figure 7.12: Static XOR Gate
Figure 7.14: Static 3-Bit LFSR
Figure 7.15: Static Function F1 Generator
Figure 7.16: Two-Phase Clock Generator
Figure 7.17: TPDL 3-Bit LFSR
Figure 7.18: TPDL Function F1 Generator
Figure 7.19: TPDL 4-Bit CLA

VIII. CONCLUSIONS AND CONTINUATION OF WORK



A. CONCLUSIONS 

In this dissertation, the design, analysis and implementation of different experimental
Complementary Gallium Arsenide (CGaAs) dynamic logic circuits have been documented.
The designed circuits are compatible with the existing CGaAs fabrication process and
design tools. Dynamic logic circuits offer several advantages over typical static logic
circuits. They have a higher speed than GaAs MESFET Direct Coupled FET Logic (DCFL)
and lower power consumption than complementary logic. The non-ratioed nature of
dynamic logic reduces the layout area.

In the first part of this dissertation, Two-Phase Dynamic FET Logic (TPDL), a new
dynamic logic family for CGaAs, was presented and implemented. Also, Domino and N-P
Domino dynamic logic were implemented in CGaAs. Four different logic functions were 
designed and implemented using static and dynamic logic families for comparison 
purposes. The transistor source-drain leakage current limits the power supply voltage for 
all circuits to 2.0 V. The transistor gate-leakage current limits the peak-to-peak input signal 
transitions to 1.75 V. Beyond these voltage levels, the power consumption of the circuits 
increases dramatically with a small improvement in the maximum operating frequency. 

CGaAs static logic designs are similar to the designs of silicon CMOS circuits. It is 
the simplest logic design because it has established design procedures. The maximum 
operating frequency of the implemented circuits in static logic is 620 MHz. Domino 
dynamic logic circuits have the smallest layout area. The use of a static inverter in Domino 
logic gates increases the power consumption. Also, only non-inverting functions can be 
implemented using Domino logic. The maximum operating frequency of the designed 
Domino logic circuits is 1.6 GHz. N-P Domino logic uses the slow P-channel transistor in
evaluating the function, which accentuates the speed disadvantage of the GaAs P-channel
devices and limits the maximum operating frequency of the designed circuits to 820 MHz.

TPDL uses only the fast N-channel transistors in evaluating the logic function. The
slow P-channel transistors are used only to precharge the output nodes. Because of the pass
gates in front of each evaluating circuit, TPDL designs are self-latching and are suitable for
pipelined architectures. Therefore, any circuit can be pipelined to reach the maximum
frequency of operation without adding any storage elements (pipeline registers). The
maximum frequency of the designed TPDL logic circuits is 2.38 GHz.

In the second part of the dissertation, the basic combinational logic gates (Inverter, 
NAND gate, NOR gate, XOR gate and XNOR gate) are designed and implemented in 
TPDL and static logic. These gates are used in the design and the fabrication of more 
complex circuits. TPDL gates have a maximum operating frequency of 2.38 GHz, while 
the static logic gates operate up to 1.2 GHz. The power consumption of TPDL gates is less 
than one-fourth that of the static logic gates when powered from the same power supply 
and having the same fan-out. The layout area of the TPDL logic gates is about one-half that 
of the static logic gates. Also, the D Flip Flop (D-FF) and Linear Feedback Shift Registers 
(LFSRs) are designed and implemented in TPDL and static logic. The comparison in 
performance between these circuits is also documented. 

In the last part of this dissertation, a 4-Bit Carry Lookahead Adder (4-Bit CLA) was
designed and implemented in static logic, pipelined static logic and TPDL. The designed
static 4-Bit CLA circuit has a maximum frequency of 260 MHz and consumes an average
power of 26 mW from a 1.75 V power supply at this frequency. Its layout area is 989 µm².
The maximum frequency is limited by the propagation delay through the entire circuit. A
three-stage, pipelined, static, 4-Bit CLA was designed to overcome this problem. Its
maximum operating frequency is 550 MHz and the power consumption at this frequency
is 77 mW. The layout area is 1853 µm². The maximum frequency of this circuit is limited
by the longest stage propagation delay. The added pipeline registers between stages
increased the power consumption and the layout area. The TPDL 4-Bit CLA operates
correctly up to 1.2 GHz and consumes an average power of 61 mW from a 1.75 V power
supply at this frequency. The layout area is 1109 µm². Comparing these designs, the TPDL
circuit consumes about half of the power of the pipelined static circuit at the
same frequency. The maximum operating frequency of the TPDL adder is more than
double that of the pipelined static adder.

The performance comparison among all the studied dynamic and static logic families is
completed through the implemented circuits. TPDL circuits have the highest maximum
operating frequency and the lowest power consumption reported to date in this
technology. In addition, their layout area is about half that of the static logic
circuits and about the same as that of the Domino and N-P Domino logic circuits. The
power consumption of the TPDL circuits is less than one-fourth that of the static logic
circuits and less than one-half that of the Domino and N-P Domino logic circuits
implementing the same logic function and powered from the same supply voltage.
Moreover, TPDL circuits can function properly up to the maximum operating frequency of
the other logic designs at lower power supply voltages and with less power consumption.
The main disadvantage of the TPDL design is that it requires two non-overlapping clock
phases and their complements for proper operation, and routing these four clock signals
to all parts of the circuit increases the design complexity. A 1.0 GHz clock generator
is designed and implemented in this dissertation. It generates the two non-overlapping
clock phases and their complements required for the operation of the TPDL circuits.
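
To make the clocking requirement concrete, the following sketch models two
non-overlapping phases and their complements as ideal square waves. It is not the clock
generator circuit designed in this dissertation. The 1 ns period matches the 1.0 GHz
figure above, while the 0.1 ns non-overlap gap and the names `phi1`, `phi2`, and
`two_phase_clock` are assumptions chosen purely for illustration.

```python
# Idealized model of a two-phase non-overlapping clock with complements.
# The gap width and all names are illustrative assumptions.

def two_phase_clock(t_ns: float, period_ns: float = 1.0, gap_ns: float = 0.1):
    """Return (phi1, phi1_bar, phi2, phi2_bar) as booleans at time t_ns.

    phi1 is high in the first half-period and phi2 in the second, each
    shortened by gap_ns so that both are low between active intervals.
    """
    half = period_ns / 2.0
    t = t_ns % period_ns
    phi1 = 0.0 <= t < half - gap_ns
    phi2 = half <= t < period_ns - gap_ns
    return phi1, not phi1, phi2, not phi2

# Sample one period on a 0.01 ns grid and confirm the phases never overlap.
samples = [two_phase_clock(i * 0.01) for i in range(100)]
overlap = any(phi1 and phi2 for phi1, _, phi2, _ in samples)
print("phases overlap:", overlap)
```

All four signals must reach every TPDL stage, which is the routing overhead noted above.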

The effects of loading on the maximum frequency of all the designed circuits are also
studied. In TPDL circuits, loads are isolated from the circuit outputs by the pass
gates. Thus, increasing the output load increases the output capacitance of all the
designed logic circuits except TPDL, and TPDL circuits are therefore the least affected
by increasing the output load. The effects of changing the power supply voltage on the
maximum frequency and the power consumption of all the designed circuits are also
studied. The results presented in this dissertation show that the TPDL circuits are the
best performing among all the studied logic families as the power supply voltage is
reduced down to 1.0 V.

This dissertation concludes that TPDL circuits are the leading candidate for the next
generation of high-speed, high-density, low-power CGaAs ICs. The results of the
described research allow CGaAs technology to be used for implementing VLSI ICs for the
first time.

B. CONTINUATION OF WORK 

As can be seen from this dissertation, TPDL circuits have the highest operating 
frequency with the lowest power consumption. Also, they have a reduced layout area 
compared with the other designs. This introduces the CGaAs technology into the LSI and 
VLSI era. 

Testing the functionality of the designed and fabricated circuits is needed to confirm
the simulation results presented in this dissertation. The maximum operating frequency
and the power consumption of the fabricated circuits are the principal parameters to be
measured. Testing should also include the effects of changing the power supply voltage
on the maximum operating frequency and the power consumption of all fabricated
circuits. The performance of the complementary static and the TPDL circuits also needs
to be compared, and the test results should be compared against the simulation results
presented in this dissertation. The effects of radiation on both the static and TPDL
circuits are also of interest. After the test phase is finished, the design and
implementation of more complex TPDL logic functions, such as those used in a
high-speed, pipelined DSP ASIC (Application Specific IC) like an FIR filter or a
digital communication integrated circuit, should be attempted.

APPENDIX A: HSPICE SIMULATION FILES



This appendix contains the HSPICE simulation files of all the implemented
complementary GaAs integrated circuits.

A.1 CGaAs INPUT/OUTPUT DRIVER IC



* File name : inout.sp
* Directory name : shehata/thesis/run1/inout
* Minimum clock time : 1.02ns
* Total average power : 115mW
* Last correction date : Dec./12/95

.include /home5/shehata/thesis/run1/parameters
*power supply 
vdd vdd 0 2.0 

* input signal 

vin clock 0 pulse(0 1.75 0ns 0.01ns 0.01ns 0.5ns 1.02ns) 

** INPUT RECEIVER CIRCUIT **

* INVI1: single inverter (one p-type and one n-type device)
.SUBCKT INVI1 in out vdd 0
J0 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J1 out in 0 0 tn0.7x10 L=0.7U W=5.6U
.ENDS INVI1

* INVI2: two inverters in parallel
.SUBCKT INVI2 in out vdd 0
J0 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J1 out in 0 0 tn0.7x10 L=0.7U W=6.0U
J2 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J3 out in 0 0 tn0.7x10 L=0.7U W=6.0U
.ENDS INVI2

* INVI3: six inverters in parallel
.SUBCKT INVI3 in out vdd 0
J0 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J1 out in 0 0 tn0.7x10 L=0.7U W=6.0U
J2 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J3 out in 0 0 tn0.7x10 L=0.7U W=6.0U
J4 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J5 out in 0 0 tn0.7x10 L=0.7U W=6.0U
J6 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J7 out in 0 0 tn0.7x10 L=0.7U W=6.0U
J8 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J9 out in 0 0 tn0.7x10 L=0.7U W=6.0U
J10 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J11 out in 0 0 tn0.7x10 L=0.7U W=6.0U
.ENDS INVI3

* INPUTDRIVER: four inverter stages in cascade
.SUBCKT INPUTDRIVER in out vdd 0
X1 in 4 vdd 0 INVI1
X2 4 5 vdd 0 INVI2
X3 5 6 vdd 0 INVI3
X4 6 out vdd 0 INVI3
.ENDS INPUTDRIVER

** OUTPUT DRIVER CIRCUIT **

* INVO1: single inverter
.SUBCKT INVO1 in out vdd 0
J0 out in vdd vdd tp0.7x10 L=0.7U W=6.0U
J1 out in 0 0 tn0.7x10 L=0.7U W=6.0U
.ENDS INVO1

* INV02: two inverters in parallel
.SUBCKT INV02 in out vdd 0
J0 out in vdd vdd tp0.7x10 L=0.7U W=6U
J1 out in 0 0 tn0.7x10 L=0.7U W=6U
J2 out in vdd vdd tp0.7x10 L=0.7U W=6U
J3 out in 0 0 tn0.7x10 L=0.7U W=6U
.ENDS INV02

* INV03: six inverters in parallel
.SUBCKT INV03 in out vdd 0
J0 out in vdd vdd tp0.7x10 L=0.7U W=6U
J1 out in 0 0 tn0.7x10 L=0.7U W=6U
J2 out in vdd vdd tp0.7x10 L=0.7U W=6U
J3 out in 0 0 tn0.7x10 L=0.7U W=6U
J4 out in vdd vdd tp0.7x10 L=0.7U W=6U
J5 out in 0 0 tn0.7x10 L=0.7U W=6U
J6 out in vdd vdd tp0.7x10 L=0.7U W=6U
J7 out in 0 0 tn0.7x10 L=0.7U W=6U
J8 out in vdd vdd tp0.7x10 L=0.7U W=6U
J9 out in 0 0 tn0.7x10 L=0.7U W=6U
J10 out in vdd vdd tp0.7x10 L=0.7U W=6U
J11 out in 0 0 tn0.7x10 L=0.7U W=6U
.ENDS INV03

.SUBCKT INV04 in out vdd 0
J0 out in vdd vdd tp0.7x10 L=0.7U W=10U
J1 out in vdd vdd tp0.7x10 L=0.7U W=10U
J2 out in vdd vdd tp0.7x10 L=0.7U W=10U